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Abstract 


We  present  a  new  method  for  approximating  the  i^artition  function  of  2D  Ising  models  iising 
a  transfer  matrix  of  order  2".  For  n  =  .{0  our  current  program  took  about  20  seconds  on 
a  Sparc  station  to  obtain  4  correct  decimals  in  the  top  two  eigenvalues  and  5  minutes  for 
(i  correct  decimals.  Eigenvectors  were  computed  at  the  .same  time.  The  temperature  was 
within  .3%  of  critical. 

The  main  idea  is  to  force  certain  entries  in  vectors  to  have  the  same  values  and  to  find 
the  crudest  representation  of  this  type  that  delivers  the  recpiired  accuracy.  At  no  time  does 
our  program  work  with  vectors  with  2"  entries. 
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1  Introduction 


The  Ising  model  was  proposed  to  ex])lain  properties  of  ferromagnets  Ijiit  since  then  it  has 
found  application  to  topics  in  (’hemistry  and  Biology  as  well  as  in  Physics.  For  any  reader 
unfamiliar  with  the  model  we  say  a  few  words  and  supply  some  references  in  Section  9.  The 
remainder  of  this  section  assumes  some  knowledge  of  the  so  called  transfer  matrix.  This 
paper  presents  a  numerical  method  for  computing  properties  of  the  2D  Ising  model  for  given 
parameter  values  such  as  magnetic  field  strength  B.  temperature  T  and  coupling  constants 
J. 

There  are  two  avenues  leading  to  such  calculations:  combinatorial  and  algebraic.  Our 
method  is  in  the  second  category  which  makes  u.se  of  a  transfer  matrix  M^^  associated  with 
a  semi-infinite  helical  grid  of  “sj)ins"  or  "sites"  with  »  of  them  on  each  circular  band.  One 
form  of  M„  for  »  =  3  and  n  =  4.  with  the  field  strength  B  normalized  with  respect  to  the 
coupling  constant  ./  is  as  follows: 


i 


where  (with  appropriate  normalizations) 

The  attractive  property  of  .U„  is  that  it  is  a  nonnegative  irreducilrle  matrix  whose 
dominant  eigenvalue  (called  the  Perron  root)  is  the  wanted  partition  function  j)er  sirin. 
Thus  it  is  only  necessary  to  approximate  this  eigenvalue  to  the  desired  accuracy  although 
the  associated  eigenvectors  are  also  useful  in  approximating  quantities  of  physical  interest. 
Moreover  M„  is  exceedingly  sparse:  it  has  exactly  2  non-zero  entries  ])er  row  (and  column) 
arranged  in  a  regular  i)a.ttern.  There  is  only  one  difficulty:  d/„  is  of  order  2”  and  we  are 
interested  in  the  case  n  —  oc.  We  know  of  no  calculations  with  n  >  ‘20  up  till  now. 

Our  approach  uses  two  finite  families  and  {Zi./};=]  of  orthogonal  indirial  vec¬ 

tors.  and  approximates  the  top  two  column  and  row  eigenvectors  oi  M„  from  the  sul)spaces 
spanned  by  them. 

Step  0.  Initialize  /  to  1. 

Step  1.  Represent  in  compact  form,  the  orthogonal  projection  P  of  the  transfer  matrix  .1/,, 
onto  the  sulispace  span(2',i./ ).  addition  represent  the  projection  Q  of  the  adjoint 
matrix  M~  onto  the  subsjjace  span(T,j./). 

Step  2.  Compute  the  two  largest  eigenvalues  and  the  associated  row  and  column  eigenvec¬ 
tors  of  P  and  Q.  These  are.  in  a  .sense,  the  best  approximations  from  the  aiven  pair  of 
indicial  sul)spaces  spau(5n.;)  spanjT,,.;).  However  they  may  not  be  good  enough. 

Step  3.  Evaluate  residual  norms,  condition  numbers  and  as.sociated  error  bounds  and  es¬ 
timates.  If  the  estimates  are  satisfactory  then  compute  the  required  properties  of  tlr 
model  and  stop.  Otherwi.se  return  to  Step  1  with  the  next  member  of  each  family,  i.e. 
increa..se  /  by  1. 

Our  goal  is  to  creep  up  to  the  coarsest  of  our  vector  representations  that  j)ermits  ap¬ 
proximations  of  the  desired  accuracy.  This  minimal  representation,  which  is  not  known  in 
advance,  gave  us  the  name  for  our  approach. 

.Note  that  the  difficulty  lies  not  in  M„  itself  but  in  the  the  representation  of  vectors 
in  R*" .  Itideed  the  special  structure  of  M„  would  permit  evaluation  of  .!/„(’  for  any  2"- 
dimensional  vector  r  with  great  efficiency.  However  a  procedure  that  costs  0(2")  may  be 
too  much  when  »  is  large  and  our  central  problem  is  the  representation  of  vectors  in  R'". 

Sparse  vpctors  occur  in  sparse  matrix  work  and  N.  Fuchs  [FucSO],  when  applying  the 
Power  .Method  to  .)/„.  keeps  only  the  large.st  1000  entries  of  each  vector.  This  device  is 
satisfactory  deep  within  the  ferromagnetic  region  of  the  model.  However  after  studying 
the  Perron  vector  in  cases  near  the  critical  temperature  we  found  that  it  contained  almost 
no  small  entries.  In  different  language,  every  configuration  in  the  "spin"  array  contriliutes 
significantly  to  the  partition  function. 
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As  a  substitute  for  si)arsity  we  propose  to  limit  the  number  of  distinct  values  that  can 
occur  among  a  vector  s  components.  We  do  this  by  means  of  a  family  of  "indicia]  function^“. 
Full  details  are  given  in  Section  3  Init  here  we  sketch  the  idea. 

vector  in  R‘"  may  be  thought  of  as  a  function  on  {1.2 . 2"}.  What  we  call  an 

indicial  function  is  really  a  partition  of  this  iiule.x  set  into  disjoint  subsets  on  each  of  which 
the  vector  is  constant.  Thus  the  vector  takes  on  fewer  than  2”  distinct  values,  perhaps  onl\' 
a  few  million  of  them.  This  sort  of  vector  recalls  H.  Lebesgue's  approach  to  integration 
via  step  functions.  For  a  given  partition  /  the  set  of  all  represental)le  vectors  forms  a 
subspace  S/  of  R*”.  We  bow  to  the  influence  of  computer  science  and  start  counting  at 

0.  If  {co . denotes  the  standard  basis  and  if  {15.93.214.800}  is  one  subset  in  the 

partition  /  then  Cjs  +  C93  +  ^214  +  ‘'suu  is  one  member  of  a  natural  orthogonal  basis  for  S/. 
In  other  words,  the  natural  basis  vectors  of  R*"  are  aggregated  according  to  /  to  ])rodnce 
an  orthogonal  basis  of  S/.  .An  important  feature  of  our  approach  is  that  these  i)asis  vectors 
are  never  represented  e.X])Iicitly  in  the  coniituter.  Careful  inde.x  nianiitulation  takes  their 
place.  Moreover  our  choice  of  /  yields  a  manageable  representation  of  the  projection  //  of 
Mr,  onto  Sf.  Pj  is  nonnegative  and  irreduciide.  Pf  is  not  as  sparse  as  M„  but  we  hold  it 
in  a  compact  form  that  pertniis  the  efficient  formation  of  Pjir  for  approjtriate  ir. 

There  is  some  freedom  in  the  choice  of  the  family  of  /'s.  Our  /'s  are  a  couiproinise 
between  physics  and  the  very  sjtecial  structure  of  M„.  Full  details  are  given  in  Sections  3. 
4.  and  5. 

The  next  task  is  to  find  the  Perron  vectors  of  Pf.  Recall  that  the  top  two  eigenvalues 
of  ,\/„  coalesce  as  the  temperature  becomes  critical.  We  have  used  two  apitroaches: 

(a)  a  block  power  method  with  a  block  size  of  2. 

(b)  a  nonsymmetric  Lanczos  code. 

The  details  are  given  in  Section  6.  It  turns  out  that  it  pays  to  compute  the  two  largest 
eigenvalues  together  with  their  column  and  row  eigenvectors.  The  reason  that  conventional 
techniques  such  as  these  are  appropriate  is  that  with  our  current  indicial  functions  /  (and 
/').  dim  Sj  =  Ol  n^2‘~^  )  and  so  Pj  is  of  modest  order.  In  addition  we  form  and  compute 
similar  quantities  for  Q  >'■  the  (orthogonal)  projection  of  .1/^  onto  an  associated  subspace 
Sf.  The  extra  information  from  Qf  allows  us  to  compute  an  approximate  Perron  row 
vettor  (/”  to  match  the  Perron  cohnun  vector  .r  for  Pf.  Pf  and  Qf  share  the  same  Perron 
root.  Fortunately  Qfi  is  diagonally  similar  to  Pf  and  need  not  be  represented  explicitly. 

We  would  prefer  to  use  the  oblique  projection  of  .!/„  onto  the  pair  of  subspaces  (Sf.Sf) 
but  we  have  not  yet  found  a  convenient  (sparse)  repre.sentation  because  some  of  the  canonical 
angles  between  Sf  and  Sf  equal  tt {2  and  this  fact  complicates  the  representation. 

■Associated  with  the  vectors  x  [PfX  =  xtr)  and  y"  (y'Qf  =  ~y‘)  are  vectors  Zf  t  Sf 
and  ii'j  £  Sf  that  approximate  the  eigenvectors  we  seek.  It  is  essential  to  be  able  to  bound 
or  estimate  the  accuracy  of  our  approximate  eigentriple  (~f.Zf.  iiQ). 
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Fortunately  l)v  using  our  special  liases  in  Sf  and  Sf  aiipropriately  we  can  compute 
(exactly  in  exact  arithmetic)  the  associated  residual  vectors 

i-f  :=  .\J„Zf  -  s/'  :=  -  Wf'Trf 

and 

-7  :=  ''•/-//ni''7lUI|;/il2). 

Although  vj  €  R^”.  ■‘>f'  €  R*"  we  can  accumulate  ||r/|p  and  |i-s/'i|'  and  u'/  during  the 
computation  of  c/  and  iiy,  and  thus  avoid  ever  having  to  store  them.  This  is  a  key  feature 
of  the  efficiency  of  our  method.  From  HrH.  |j.s(|.  and  we  can  compute  error  hounds  and 
error  estimates.  This  is  discussed  in  Section  7. 

It  is  likely  that  our  error  estimates  indicate  that  Zf.  ay-  and  ~j  are  not  sufficiently  accu¬ 
rate.  In  that  case  we  pick  the  next  indicia!  function  /  in  our  family  so  that  /  is  a  refiiieinent 
of  /  and  Sj  C  Sj.  dim  Sj  ~  2  dim  Sf.  Then  we  repeat  the  cycle  of  ajiproximations  until 
the  accuracy  requirement  is  met  oi'  our  resources  are  exhausted.  This  is  not  an  iteiative 
method  hecau.se.  in  a  finite  numher  of  steps,  the  indicia!  function  hecomes  the  identity. 

By  creeping  up  to  adetiuate  apiiroximations  from  helow  we  ensure  that  we  end  uj)  with 
the  coarsest  indicial  function  that  meets  the  given  tolerance.  In  this  way  do  we  achieve  the 
minimal  representation,  from  our  family,  that  gives  our  method  its  name. 

It  is  worth  repeating  that  at  no  time  in  the  cycle  do  we  need  to  store  a  vector  with  2" 
components. 

Quantifies  of  interest  are  usually  partial  derivatives  of  the  partition  function.  If  we  used 
differences  to  estimate  derivatives  that  would  sharjily  increase  the  retpiired  accuracy  of  our 
approximations.  Fortunately  .S.  Gartenhaus  [GarS3]  and  N.  Fuchs  [FucSQ]  have  shown  that 
some  of  the  quantities  of  interest  may  he  expressed  in  terms  of  r  and  tr“  and  so  there  is  no 
need  to  use  differences.  This  increa.ses  the  .scope  of  our  approach  significantly. 


2  Basic  Notation  and  Terminology 

We  will  follow  Householder's  conventions:  upper  case  Roman  letters  for  matrices,  lower  case 
letters  for  column  vectors,  and  lower  case  Greek  letters  for  scalars.  However,  the  letters 
i.  j.  k.  1.  m.  n  and  t  will  be  reserved  for  integers.  All  matrices  and  vectors  v'iH  be  real. 
The  transpose  of  A  will  be  denoted  by  A",  and  the  inner  product  of  vectors  x  and  y  by 
{x,y)  =  x’y.  We  will  exclusively  use  the  Euclidean  norm  for  vectors:  IjxH  =  x/Px. 

As  the  theory  behind  our  indicia!  subspaces  is  intimately  connected  with  the  binary  rep¬ 
resentations  of  numbers,  we  will  index  the  rows  and  columns  of  a  matrix,  and  the  elements 
of  a  vector,  starting  from  0.  unless  otherwise  specified.  Thus  for  A  €  and  x  e  R”’. 

[A)t,j  denotes  the  entry  in  row  i  and  column  j  of  A,  0  <  7  <  /  -  1,  0  <  j  <  m  -  1.  and  x(i) 
denotes  the  element  of  2:.  0  <  ?  <  m  —  1. 

The  /  X  m  zero  and  identity  matrices  will  be  written  as  Ouw  and  /;xm  respectively:  the 
2"  X  2"  identity  matrix  will  be  written  as  I„.  For  Q  €  R'^'".  we  let  span(Q)  denote  the 
subspace  of  R^  spanned  by  the  m  columns  of  Q.  Similarly,  if  S  is  a  set  of  vectors  in  R^  the 
subspace  spanned  by  these  vectors  will  be  denoted  by  span(cS). 

The  symbol  :=  will  denote  a  definition,  and  the  symbol  □  will  mark  the  end  of  a  proof. 

By  a  (binary)  string  u.-.  we  shall  mean  a  finite  sequence  of  Os  and  Is.  The  empty  string 
is  denoted  by  s.  We  write  {0. 1}"  for  the  set  of  all  strings  (including  5).  and  {0.  l}"  for  the 
set  of  77-bit  strings,  r?  >  1.  The  length  of  a  string  is  denoted  by  |u;|.  the  concatenation  of 
two  strings  ^1  and  w2  by  0  Cv'o  •  and  the  reversal  of  a  string  u7  (i.e.  written  backwards) 
by 

eg.  {0.1}^  =  {000.001.010.011.100.101.110.111}. 

|001101|  =  6,  010  o  110  =  010110.  001101^=101100. 

We  also  define  Isj  :=  0.  :=  s.  and  s  0  u.’  =  .*;  o  £■  :=  w’  for  any  string  -jJ. 

For  a  nonempty  string  w.  we  denote  its  A**  bit  from  the  left  by  w{i].  i  =  1.2 . |w'(. 

Thus  u;  =  u;(  1 )  o  a;(2 )  o  •  •  •  o  |w;| ),  and  |u7| )  o  u2(  |u7|  —  l)o  •••Ou7(l).  For  given  I . 

1  <  ^  <  I-''!-  tfie  /-bit  prefix  of  is  the  substring  vj(l)  o,*;(2)  o  •  •  •  0  and  the  /-bit  suffix 
of  w  is  the  substring  ^’(|wi7|  —  /  +  1 )  o  w’(  — /-(-2)  o  ■  •  •  o  |u7| ).  The  empty  string  ;  will  be 

considered  to  be  the  0-bit  prefix  and  the  0-bit  suffix  of  any  string  We  shall  also  refer  to 
-..•(  1 ).  -.■■(  1 )  Owj(2)  and  (^'| )  as  the  leading  bit,  leading  bit  pair  and  trailing  bit  respectively 
of  a  string  w'  with  |>jj  >  2. 

There  is  a  natural  correspondence  between  binary  strings  and  the  nonnegative  integers 
N  arising  from  the  concept  of  the  binary  representation  of  numbers.  We  formahze  this  by 
defining  two  functions: 

r  :  {0.  1}  —  N  mapping  w’  6  {0.  l}*  to  the  integer  value  it  represents  (7’(c )  :=  0). 
and  for  r?  >  1. 

:  {0. 1 . 2"  -  1}  —  {0.  1}”  mapping  i  €  N  to  its  7?-bit  binary  representation. 


We  note  that  there  is  no  uniqueness  in  these  maps:  two  different  strings  may  have  the  same 
value  under  v.  eg.  i'(Oll)  =  =  3.  and  a  nonnegative  integer  is  mapped  to  different 

strings  under  different  cr„'s,  eg.  (72(3)  =  11.  f73(3)  =  Oil.  This,  however,  should  cause  no 
confusion.  We  can  extend  r  to  sets  of  strings:  t’(E)  =  ^  6  E}.  E  C  {0.  l}".r)  >  1. 

Two  other  properties  of  binary  strings  that  are  of  interest  to  us  are  their  1-bit  counts 
and  bit  transition  counts. 

eg.  000000  has  no  Is.  111111  has  six  Is.  and  101101  has  four  Is. 

000000  and  111111  have  no  bit  transitions.  001111  has  one  bit  transition. 
101101  has  4  bit  transitions,  and  010101  has  5  bit  tran.sitions. 

We  note  that  for  ^  €  {0.  l}".  has  at  most  n  Is  and  at  most  n  —  1  bit  transitions. 


3  The  subspace  span(e>„  /)  and  its  uses 

spaii(<5n./)  consists  of  ail  vectors  in  R^"  constrained  to  carry  the  same  value  in  various  places 
(or  positions).  Each  position  is  given  by  a  bit  string  of  length  n:  the  zeroth  (top)  position 
is  represented  by  00. .  .0  and  the  (2"  -  l)th.  or  last,  position  is  given  by  11. . .  1. 

1.  A  typical  set  of  those  positions  (i.e.  strings  of  length  n)  that  carry  the  same  value  is 

characterized  by  a  triple  (uj.k.t)  and  is  called  an  indicial  set  and  is  denoted  by 

u.;  is  the  common  /-bit  suffix  of  the  positions  (i.e.  u;  is  the  substring 
consisting  of  the  last  1  bits). 
k  is  the  common  1-bit  count  of  the  positions,  and 
t  is  the  common  bit  transition  count  of  the  positions. 

Here  k  and  t  reflect  the  physics  but  it  is  that  exploits  the  form  of  M„. 

2.  To  each  distinct  corresponds  the  vector  ^  t  obtained  by  summing  those  columns 

of  the  2"  X  2"  identity  matrix  whose  indices  belong  to  the  set  In  addition  Sn.i  = 

:  I'^l  =  0-  The  columns  of  the  2"  x  |5„./|  matrix  A'/  are  the  x^,k.t  appropriately 
ordered.  By  our  choice  A'/  has  orthogonal,  but  not  orthonormal,  columns: 

X'Xt  =  Dx  which  is  diagonal. 

Note  that  Sn.i  and  the  columns  of  A'/  are  the  same  sets,  the  latter  are  ordered.  |5n./i  = 
0(7?^2^“M  for  large  n. 

3.  The  vectors  x^_k.t  are  never  represented  explicitly.  Our  choice  exploits  the  duodiagonal 

structure  of  the  transfer  matrix  M„  so  that 

Mn(Snj)  C  (5„./+i). 

One  useful  consequence  is  that  the  orthogonal  projection  of  A/„  onto  span(5n./).  writ¬ 
ten 

:=  D^KX^MnXi 

has  at  most  4  non-zero  entries  per  column.  More  precisely.  is  the  representation 
of  Mn's  projection  in  our  basis  given  by  A'/'s  columns,  in  order.  The  orthogonal 
projector  onto  span(»S„./)  is 

4.  .Among  other  things  wp  compute  the  dominant  eigenpair  {‘n.g]  of 

^n.lS  =  5’!'.||5||oc  =  l,ff(/)  >  O.all  i. 

The  notation  tt  reminds  us  that  it  is  'ue  Perron  root  of  the  matrix  P^j.  Our  method 
defines  {~.g).  where 

g  =  Xig. 
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as  an  approximation  to  the  true  dominant  eigenvector  of  M„.  However  we  need  to 
estimate  the  quality  of  g  and  this  requires  the  computation  of  MnQ-  By  Remark  3. 

Mng  =  C  span(5„./+i) 

for  an  appropriate  coefficient  vector  h.  We  can  compute  h  from  g  without  invoking 
any  2"-vectors  at  all. 

In  order  to  realize  the  method  outlined  above  several  technical  difficulties  must  be  re¬ 
solved.  Step  2  requires  an  ordering.  Step  3  requires  a  compact  representation  for  the  projec¬ 
tion  matrix,  and  Step  4  requires  various  inner  products  of  vectors  that  are  not  represented 
in  a  conventional  way. 
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4  Theory  of  indicial  subspaces 

4.1  Indicial  sets,  vectors  and  bases 

We  begin  by  defining  the  building  blocks  of  our  indicial  subspaces.  Each  such  subspace  is 
obtained  by  forcing  two  vector  components  to  have  the  same  value  if  the  binary  represen¬ 
tation  of  their  indices  have  the  same  number  of  I's,  the  same  number  of  bit  transitions  and 
the  same  /-bit  suffixes,  where  /  is  a  fixed  nonnegative  integer.  By  grouping  the  indices  of 
equal-valued  components  together,  we  obtain  a  partition  of  the  collection  of  indices,  and  an 
associated  basis  for  the  subspace. 

We  first  illustrate  the  ideas  with  a  simple  example  with  suffixes  of  length  1.  The  space 
we  shall  work  in  is  and  we  shall  regard  indices  as  5-bit  binary  strings.  Consider  the 
subspace  C  of  obtained  by  forcing  two  components  to  have  the  same  value  if  their 
indices  have  the  same  1-bit  count,  the  same  bit  transition  count  and  the  same  trailing 
bit.  By  grouping  the  indices  of  equal-valued  components  together,  we  obtain  a  partition 
of  {00000 . 11111}  into  sets  of  strings,  as  shown  in  the  fourth  column  of  the  example 
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below. 


trailing  bit 

#  is 

#  transitions 

indicial  sets 

indicial  vectors 

0 

0 

0 

{00000} 

Co 

0 

1 

1 

{10000} 

0 

1 

2 

{00010.00100.01000} 

^2  +  64  +  Cs 

0 

2 

1 

{11000} 

C24 

0 

2 

2 

{00110.01100} 

ce  +  c\2 

0 

2 

3 

{10010.10100} 

^18  +  C20 

0 

2 

4 

{01010} 

Cio 

0 

3 

1 

{11100} 

C28 

0 

3 

2 

{OHIO} 

ei4 

0 

3 

3 

{10110.11010} 

^22  +  C26 

0 

4 

1 

{11110} 

C30 

1 

1 

1 

{00001} 

Cl 

1 

2 

1 

{00011} 

C3 

1 

2 

2 

{10001} 

Cl7 

1 

2 

3 

{00101.01001} 

Ch  +  C9 

1 

3 

1 

{00111} 

e-r 

1 

3 

2 

{10011.11001} 

Cl9  +  C25 

1 

3 

3 

{01011.01101} 

Cll  +  Ci3 

1 

3 

4 

{10101} 

C2I 

1 

4 

1 

{01111} 

Cl5 

1 

4 

2 

{10111.11011.11101} 

C23  +  ^27  +  ^29 

1 

5 

0 

{11111} 

C31 

We  call  each  member  of  the  partition  an  indicial  set.  To  each  indicial  set  I.  we  can  associate 
an  indicial  vector,  which  has  ones  in  positions  whose  indices  are  in  I.  and  zeros  elsewhere. 

Each  indicia]  vector  is  therefore  a  sum  of  vectors  from  the  standard  basis  {to . fai}- 

These  vectors  are  shown  in  the  last  column  of  the  example.  A  moment's  thought  will  reveal 
that  they  form  a  ba.sis  for  the  subspace  C. 

Formally,  we  define 

Definition  4.1.1  Let  n>l.Q<k<n.  0<t<n-l,  and  ui  £  {0.  l}*  with  |w|  <  n. 
Define 

I'f.k.t  ■—  {m  €  {0-  ]}”  •  M  has  k  Is  and  t  hit  transitions,  and  is  the  suffix  of  i.i}. 

called  a  suffix-based  indicial  set.  For  a  given  u;  €  {0.  l}"  with  |u.'!  <  n.  we  call  a 
pair  k.  t  legitimate  if 
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eg-  lU.O={0000}.  It.3,2=  {1011.  1101}.  1^12,2  =  {1001}. 

We  leave  it  as  an  exercise  for  the  reader  to  verify  that  Ii  3  3  =  0- 
Definition  4.1.2  Suppotie  E  C  {0,  l}",  r?  >  1.  Define  the  vector  X£  £  R^"  by: 

f  1  if  <T„(?)  €  E 

a;E(*  '■=  \  0  <  /  <  2"  -  1. 

^  lO  ifar^df^E  -  - 

XE  (considered  as  a  function  on  integers)  can  be  regarded  as  the  characteristic  function  \e 
0/ E.  If  E  is  a  suffix-based  indicial  set,  i.e.  E  =  I"^. ,  for  some  u.-.  k  and  t.  we  call  X£ 
a  suffix-based  indicial  vector  and  we  also  write  it  as  Note  that  =  0.  and  that 

IkElP  =  IEI- 

Definition  4.1.3  Let  n  >  1,  0  <  /  <  n.  Define 


Sn.t  :=  {a'",i..(  :  Ml  =  h.t  legitimate}. 

An  order  will  be  imposed  on  Sn.i  in  Section  5.1  and  so  we  call  Sn.t  on  indicial  basis  and 
spa.n(Sr,.i)  an  indicial  subspace. 

In  discussions  where  the  value  of  n  is  assumed  fixed,  we  will  omit  it  in  writing  indicial 
objects.  Thus,  we  write  and  instead  of  I" j. ,  and  x”  ^  ,  respectively. 

In  an  analogous  fashion,  we  can  define  prefix-based  indicial  sets  J"  /.  ,•  vectors  y}}  f.  ^  and 
bases  7^,/.  It  turns  out.  however,  that  prefix-based  indicial  objects  can  be  derived  from 
corresponding  suffix-based  ones.  This  will  be  explored  in  Section  4.4. 

W'e  note  here  for  future  analyses  some  fundamental  properties  of  suffix-based  indicial 
sets,  vectors  and  bases.  W’e  urge  the  reader  to  go  through  them  carefully. 


Fundamental  Properties 

Proposition  4.1.1  For  fixed  n  and  pl,  the  nonempty  indicial  sets  ,  are  disjoint.  Thus 
for  0  <  I  <  n,  Sn.t  is  an  orthogonal  set  (i.e.  for  x.y  £  Sn.t-  ^  y  ^  y  =  0,-^  and  so  is 
linearly  independent. 

Proposition  4.1.2  For  fixed  n  and  |.4,j.  the  collection  of  nonempty  indicial  sets  j. ,  is  a 
jxirtition  of  {0.  l}".  Thus  for  each  0  <  i  <  2"  —  1.  there  is  a  unique  x  £  Sn.t  with  x{i)  =  1. 

Proposition  4.1.3  The  trailing  bit  and  the  parity  of  the  bit  transition  count  t  of  a  nonempty 
string  p  determines  its  leading  bit. 

Proof.  .A.n  even  number  of  transitions  (viewing  from  the  right  end  of  p  to  its  left  end) 
preserves  the  traihng  bit  whereas  an  odd  number  of  transitions  reverses  the  traihng  bit.  □ 

Proposition  4.1.4  The  strings  in  each  indicial  set  I”  r?  >  2.  u;  e.  all  have  the  same 
leading  bit  since: 

(a)  the  trailing  bit  of  .4.-  determines  the  trailing  bit  of  each  M  €  I”  fc  r 

(b)  by  Proposition  4- 1-3.  if  t  is  even,  the  leading  bit  of  each  /z  €  I”  ,  must  be  the  same 
as  its  trailing  hit:  if  t  is  odd.  the  leading  bit  mu.st  be  different. 
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4.2  Action  of  duodiagonal  matrices  on  indicia!  bases 


Our  primary  goal  is  to  analyze  the  structure  of  the  column  projection  matrix  i  and  of  the 
row  projection  matrix  P^i-  This  requires  us  to  understand  the  action  of  the  transfer  matrix 
Mn  on  basis  vectors  x  in  Sn.i  for  /  >  1.  In  this  section,  we  shall  see  how  to  decouple  the 
action  of  A/„.  and  thus  express  MnX  as  a  linear  combination  of  indicial  vectors.  Section  4.3 
then  analyzes  the  structure  of  P^j  and  P^j  for  /  >  1. 

Let  n  be  a  fixed  integer  >  3.  and  /  be  a  fixed  integer  >  1.  Recall  that  P^j  =  X" Mr,X 
where  the  columns  of  X  are  vectors  in  Sn.t-  aiid  =  A'"A'.  To  elucidate  the  action  of  Mr, 
on  A',  we  introduce  a  general  class  of  duodiagonal  matrices  f  of  which  M„  is  a  special 
case.  We  first  illustrate  it  for  n  =  4: 


In  general,  we  consider  the  2"  x  2"  duodiagonal  matrix  defined  below; 
Definition  4.2.1  Let  n  be  an  integer  >  3.  The  2"  x  2”  duodiagonal  matrix 


“3  n  \ 

“3  “3  /.  ' 


where 


1^(0) 


x2”-* 


ug  0 

fio)  _  “o  0 

0  ul 

_  0  ug 
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[t{D  - 
^  n 

■  r(ri 

fril) 

0 

6  Il2"-'x2"-^ 

r<l) 

1 

0 

0 

uj 

.  0 

[MD 

.  0 

uf  _ 

■  fri2) 

0  ■ 

U2 

0 

r<2)  = 

Cri2) 

II 

U2 

0 

0 

Uj 

.  0 

t-(2)  _ 

0 

uf 

n 

■  [ri3) 

{'(3) 

0  ■ 

^  j^2”->x2"-^_ 

p(3)  _ 

“3 

ul 

0 

0 

0 

4 

.  0 

fri3) 

0 

4, 

The  column  of  Un  is  denoted  by  Uf..  k  =  Q,l, _ 2"  —  1. 

The  transfer  matrix  M„  is  equal  to 

(a  a\(bb\(  a~^  a~^  \  /  b~^  b~^  \ 

^"[[b  b  )''[c  c  )''[b-^  b-^  )  '  c-^  )/ 

Our  goal  in  this  section  is  to  show  that  the  result  of  applying  I'n  to  an  indicial  vector 
can  be  expressed  as  a  linear  combination  of  2  or  4  indicial  vectors.  Before  beginning  the 
analysis,  we  illustrate  the  ideas  by  working  out  the  action  of  f  's  on  some  of  the  basis  vectors 
in  c^s.i  in  the  example  below.  The  columns  of  the  32  x  32  identity  matrix  I5  will  be  denoted 

by  {eo-^i . ^31}.  We  urge  the  reader  to  go  through  the  example  carefully,  and  to  observe 

in  each  case  how  the  result  of  applying  f’s  to  an  indicial  vector  could  be  expressed  as  a 
linear  combination  of  2  or  4  indicial  vectors. 


eg.  (fl)  Iq  1  2  —  {00010. 00100. OlOOOj.  .Tq  j  2  —  ^2  4"  ^4  4"  €§. 
f  5^0. 1.2  =  f  5^2  4-  I  5C4  +  (  568 
=  U2  4-  U4  +  Us 

=  ( iiot4  4-  )  4-  ( U0C8  4-  UgCg)  +  (  UjCie  4-  ) 

=  Uo(e4  4-  es)  4-  Uglcs  +  eg)  4-  njCie  +  ujeir 
=  ^0^00,1.2  4'  “0^01.2.3  4"  -^00. 1.1  4"  *^i"roi.2.2 

(6)  I0.2.3  ~  (lOOlO.  10100}.  Xq  2,3  =  ^18  4-  C20' 

f  ■5^0.2.3  =  ^18  4-  «20 

=  («5c4  4-  u\es)  4-  (M2f8  +  “2^9) 

=  00. 1.2  4-  W2-roi,2.3 
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(  C )  ~  {lOlll,  11011.  lllOl}.  3;^  ^  2  —  ^23  +  ^27  +  ^29- 

^  ''’•^1,^.2  ~  ^'23  +  *'27  +  **29 

=  (M2^14  +  **2*15)  +  (**3^22  +  **3^23)  +  (U3t26  +  <*3^27) 

=  U2^io,3.2  +  **2^11.4.1  +  "3-*’l0.3.3  “3-*^ll,4.2 

id)  1^3.1  =  {00111}.  xt3.i  =  e7, 

^  52'i,3.i  =  u- 

=  **0^14  +  **0^15 
=  10,3,2  +  “o^n.4,1 

Efficient  processing  with  duodiagonal  matrices  r„  depends  on  the  foUowing  key  obser¬ 
vations  regarding  their  nonzero  entries: 

(a)  the  nonzero  entries  of  Uk  occur  exactly  at  positions  2k  mod  2"  and  {2k  +  1)  mod  2". 

(b)  the  parameters  uj‘  and  (i  =  0.1.  j  =  0.1. 2. 3)  are  the  nonzero  entries  of 

i‘2”-^j  +  2k+i^  0  <  A.’  <  2"~^  -  1. 

eg.  for  f's,  the  parameters  and  Uj  (i.e.  i  —  1.  j  =  1}  are  the  nonzero 
entries  of  Mg.  mh.  ui3  and  uis. 

Equivalently,  the  parameters  uj'  and  are  the  nonzero  entries  of  Uf;.  for  those 

k  €  {0.1 . 2"  -  1}  where  the  leading  bit  pair  of  (7„(fc)  is  (72(J)  (the  2-bit  binary 

representation  of  j)  and  the  trailing  bit  of  <Tn(k}  is  <T](f)  (the  1-bit  binary  representa¬ 
tion  of  i).  In  the  above  example,  the  only  o-bit  strings  with  leading  bit  pair  01  and 
trailing  bit  1  are 

<75(9)  =  01001,  ct5(  11  )  =  01011.  <75(13)  =  01101  and  (75(15)  =  01111. 

Since  our  indicia!  subspaces  are  characterized  using  bit  strings,  we  would  like  to  charac¬ 
terize  the  positions  of  the  nonzero  entries  of  f'„  similarlv.  In  particular,  we  need  operators 
on  bit  strings  that  correspond  to  multiplying  by  2  (and  then  possibly  adding  1)  modulo  2”. 
Th<  operators  2>j  and  2w  +  1  defined  below  accomplish  that.  We  urge  the  reader  to  read 
the  definition  with  care. 

Definition  4.2.2  For  a  nonempty  string  the  strings  2^  and  2^'  +  1  are  defined  by: 

2wV  :=  -2(2)  o  u;(3)  o  •  •  •  o  u2(  |u,'| )  0  0 
and  2w' -(- 1  :=  w(2)  o  **>(3)  o  •  •  •  o  ,.j(  |w-| )  o  1. 

It  IS  easily  verified  that  v{2^')  =  2n(^')  mod  2^"!  and  v(2^-  -I-  1)  =  (2r(w')  -)-  1)  mod  2l'*'L  IVe 
extend  the  definition  to  sets  of  strings:  for  E  C  {0. 1}".  n  >  1. 

2E  {2^'  €  E}  and  2E  -)-  1  :=  {2»k.’  -f  1  :  wj  €  E). 
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Note  that  for  a  nonempty  string  u,-.  2^'  and  2u;+  1  are  suffixes  ofw'oO  and  w'o  1  respectively. 
Proposition  4.2.1  restates  observations  (a)  and  (b)  in  terms  of  bit  strings. 


Proposition  4.2.1  Let  0  <  A'  <  2"  -  1.  and  let  /i  =  anik).  The  nonzero  entries  of  are 
and  where  i  =  i'(ij,{n))  and  j  =  r(/i(  1 )  o  ^(2)).  and  they  occur  at  positions  v{2p) 

and  v{  2p  +  1 )  respectively. 

We  remind  the  reader  that  for  a  string  p  6  {0.  l}".  v(p)  denotes  the  integer  value  it 
represents. 

We  now  consider  the  action  of  on  a  special  kind  of  indicia!  vector.  Suppose  E  C 
{0.  l}"  is  such  that  all  its  strings  have  the  same  leading  bit  pair  pi  and  the  same  traihng 
bit  pf  Recalling  that  a  matrix- vector  product  is  a  linear  combination  of  the  columns  of  the 
matrix  with  coefficients  given  by  the  entries  of  the  vector,  we  see  that 

/f  where  k  where 

X£(fe)  =  1  crn(fc)  6  E 

From  Proposition  4.2.1.  the  Uk's  involved  in  the  vector  sum  have  the  same  two  parameters 
and  (where  i  =  v(pt)  and  j  =  v(pt))  as  their  nonzero  entries,  and  those  entries 

have  indices  in  i’(2E)  and  r(2E-|-  1)  respectively.  We  have  estabhshed  the  following: 


Theorem  4.2.2  (cf.  Theorem  2.3.2.  [HenOl])  Let  E  C  {0.  l}".  n  >  Z.  be  such  that  all  its 
strings  have  the  same  leading  bit  jxiir  pi  €  {0.  l}^  and  the  same  trailing  bit  pt  €  {O.l}^. 
Then 


UnXE  =  Wj'J-2E  +  «;‘^’^2E+1 


where  i  =  v(pt)  and  j  —  v(pi). 


Theorem  4.2.2  suggests  that  for  a  nonempty  indicia!  set  E  =  (pl  =  /).  the  result 
I’nXE  can  be  decomposed  by  partitioning  the  strings  in  E  according  to  their  2"'^  leading 
bit.  We  formalize  this  below. 


Definition  4.2.3  Let  E  C  {0.  l}'^.  n  >  2.  ITe  define 

E°  :=  {w  €  E  :.*;(2)  =  0} 
and  E^  :=  {w- €  E  : -;(2)  =  l}. 

i.e.  E°  and  E^  are  subsets  of  strings  in  E  having  2"'^  leading  bit  0  and  1  respectively.  It 
follows  from  the  definition  that  E  is  the  disjoint  union  o/E®  and  ET 


eg.  E  =  {0011.0100.0101.1000}.  E°  =  (0011. 1000}.  E^  =  (0100.  OlOl}. 

E  =  {0010.0011}.  E°  =  E.  E*  =  0. 

Theorem  4.2.3  fcf.  Theorem  2.1.7.  [HenQl])  Let  E  =  I"  >  2.  a,-  7^  c.  Then  |2E°|  = 
|2E°  +  1|  =  |E®!  and  |2EM  =  |2Ei  +  1|  =  \E^\. 
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Consider  now  applying  f'„  to  the  indicial  vector  €  Sn.i-  Let  E  =  Then  the 

strings  in  E°  have  the  same  leading  bit  pair  and  the  same  trailing  bit  since  Proposition  -1.1.4 
determines  the  leading  and  trailing  bits,  and  the  2''‘*  leading  bit  is  0  by  definition.  Similary. 
the  strings  in  E^  have  the  same  leading  bit  pair  and  the  same  trailing  bit.  Thus  E°  and  E^ 
each  satisfy  the  conditions  of  Theorem  4.2.2. 

Corollary  4.2.4  Let  E  =  PI  —  1.  be  a  nonempty  indicial  fiei.  and  let  pi  denote  the 

leading  bit  of  strings  in  E.  Then 


f  n^E  ~  ^  j;0  4"  ^  n^E^ 

where  i  =  t?(..j(/)).  j  =  v{pi  o  0)  and  j'  —  v{pi  o  1). 


(1) 


The  vectors  appearing  in  (1)  are  in  fact  suffix-based  indicial  vectors,  as  Theorem  4.2..^ 
shows. 

Theorem  4.2.5  (cf.  Theoiem  2.1.8.  [Hen9l])  Let  E  =  I"  i. n  >  .3.  0  <  |-’|  <  n.  Then 
2E°.  2E°  1.  2E^  and  2E^  -|-  1  are  themselves  suffix-based  indicial  sets,  and  there  exist 
bkt  =  ±1  and  St,  =  ±1.  i  =  0.  1.2.3.  such  that: 


2E^  =  LoO  .k'^dk2.t'^St2  —  l2w./c+51;2  (2  * 

-f-  1  “  ^wol.fr4-5fc3.t+^t3  —  ^2-/*4-l.if4’(5/r3.('4’<5?3  * 

Furthermore,  the  pairs  (Sk,.St,).  i  =  0. 1.2.3.  are  distinct,  and  so  2E°.  2E°  4-  1.  2E^  and 
2E^  -I-  1  are  pairwise  disjoint. 


Proof.  We  will  not  prove  the  theorem  here,  but  instead  give  the  values  of  Sk,  and  St,  for 
different  cases. 

a)  trailing  bit  of  is  0  and  t  is  even: 


2E”  = 


2E°  -1-  1  —  • 

2E'  =i:o0...t-i-  2E1 -h  1  = 


b)  trailing  bit  of  is  0  and  t  is  odd: 


2E"  =  i:o0.*-i..-i- 
2E^  =  i:o0..-i.r 


c)  trailing  bit  of  is  1  and  t  is  even: 

OF®  —  T" 

-  ■‘u-oO./t-l.t’ 

or  1  —  T" 

-  ^^o0.k-\.t+l- 


2E''  -n  = 

2E‘  +  1  = 


2E<'  +  i  =  i;.,j,,,.,. 
2E‘  + 1  =  i;.,.,,. 
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d)  trailing  bit  of  is  1  and  t  is  odd: 

2E“  +  1  = 

2E‘  =  ■2E‘  +  I  =  o 

Note  that  if  =  0.  then  X2E0  =  ^2E°+i  ~  ^2E1  ~  ^2Ei-i-i  “ 

0.  However.  E°  and  E^  cannot  both  be  empty  since  E  =  E°  U  E^  and  E  is  nonempty.  In 
addition,  the  nonzero  vectors  among  X2£0-  ^2E0+i'  ^2E1  ^zE^  +  i  distinct  from  each 

other  since  2E°.  2E°  +  1.  2E^  and  2E^  +  1  are  pairwise  disjoint  by  Theorem  4. 2. .5.  So  I'n^'E 
is  a  linear  combination  of  either  2  or  4  suffix-based  indicial  rectors. 

We  can  in  fact  relate  I  riX£  to  the  vectors  in  the  basis  Sn.i-  Define  a  subv^ector  of  a  vector 
j  to  be  a  vector  obtained  by  setting  zero  or  more  entries  of  x  to  0.  i.e.  a  subvector  has  the 
same  number  of  entries  but  more  of  them  are  0.  From  Theorem  4.2.5.  :r2E0-  ■^■2E0+1'  •^'•’Ei 
and  3-2£i^i  are  subvectors  of  different  vectors  in  Sn.i  since  |2^|  =  |2i.^-  +  1|  =  /.  Thus  U„x-e 
is  a  hnear  combination  of  either  2  or  4  subvectors  in  Sn.i- 

To  recapitulate  our  analysis,  we  rework  the  example  given  at  the  beginning  of  the  section. 

eg.  (fl)  E  =  1^1 2  =  00100- 01000}-  E°  =  {00010.00100}.  E^  =  {01000}. 

2E°  =  {00100.01000}.  2E°  +  1  =  {00101.  OlOOl}. 

2E1  =  {10000}.  2E1  +  1  =  {10001}. 

f  5^E  =  *^0-^2EO  “o^2E0  +  1  ^1^2E1  "*■  “l'^2El  +  l 

(6)  E  =  152.3  =  {10010.10100}.  E°  =  {10010. 10100}.  E^  =  0. 

2E°  =  {00100.01000}.  2E°  +  1  =  {OOlOl. OlOOl}. 

f  5XE  *  ^2*^2E^4*1 

(c)  E  =  rj., 2  =  {K^iii- 11011- 11101}-  E°  =  {10111}.  E^  =  {non.  11101} 

2E°  =  {OHIO}.  2E°+1  =  {01111}. 

2E^  =  {10110. 11010}.  2E1  +  1  =  {10111.  non} 

f  5J:e  =  “2-f2E0  "2^2E0  +  1  "3-*’2E1  +  1 

(d)  E  =  rj  3,1  =  {00111}.  E°  =  {00111}.  E^  =  0. 

2E°  =  {OHIO}.  2E°  +  1  =  {01111}. 

i  O^E  —  “o-^2E0  *^0^2E0  +  1 

4.3  Structure  of  the  column  projection  matrix 

.■\rmed  with  our  understanding  of  the  action  of  duodiagonal  matrices  on  suffix-based  indicial 
bases,  the  analysis  of  the  structure  of  the  column  projection  matrix  P^j  becomes  straight¬ 
forward.  We  shall  show  that  for  /  >  1.  P^/  is  sparse  with  either  2  or  4  nonzero  entries  per 
column,  and  we  shall  precisely  locate  the  positions  of  those  entries,  and  express  their  values 
in  a  way  that  enables  them  to  be  computed  without  using  any  vectors  in  . 


As  before,  we  shall  work  with  the  general  duodiagonal  matrix  The  corresponding 
results  for  can  be  obtained  by  substituting 


Consider  the  projection  matrix  P  =  X’UnX .  where  the  columns  of  A’  are  the 
vectors  in  Sn.i  and  Dx  =  A'’A'.  We  index  the  rows  and  columns  of  P  by  the  triple 
where  x^x.t  €  Sn.i-  Then  the  entry  in  row  and  column  (^'.k.i)  of  P  is  given  by: 

7!  .f  •  ^  ~  Tj 

ll-'.A'.f'l 

Let  E  =  l^,k.i-  The  analysis  of  Section  4.2  shows  that  is  a  linear  combination  of  either 
2  or  4  subvectors  in  Sn.i  (with  each  subvector  arising  from  a  different  basis  vector).  Since 
the  vectors  in  Sn.i  have  pairwise  disjoint  supports  (cf.  Proposition  4.1.2).  P  is  sparse  with 
2  or  4  nonzero  entries  per  column  arising  from  the  nonzero  inner  products  (.r^<  f  'n 
Specifically,  column  (sj.k.t)  of  P  has  2  nonzero  entries  if  one  of  E°  and  E^  is  empty,  and 
has  4  nonzero  entries  if  both  E*^  and  E^  are  nonempty. 


Theorem  4.3.1  Lef  E  =  l-j  =  I.  bf  a  nonempty  indicia!  .set.  and  for  each  bit  string 

tn  E  denote  the  leading  bit  by  /p.  Let  Sk,.  ht,.  i  =  0. 1.2.3.  be  as  given  in  Theorem  T---L 
and  let  i  =  u(w(/)).  j  =  v{pi  o  0)  and  j'  =  v(pi  o  1).  IfSP  ^  0.  then  colvnin  (w.  k.  i)  of  P 
has  nonzero  entries  in  rows  ( 2^.  k  +  Sko- 1  +  hto)  and  ( 2w  +  l.k  +  hki  .t  +  6ti)  with  valves 


1 


i  Aq  .f +!? 


<0  I 


■{■^'2^.k 


4-i5Ao 


and 


O - ^ 

- - - rUj'lE^I  by  Theorem  4-2.3 

|l2w.A.*+dfco,(+<>to  I 


1 


11-2 


:  (^2wf- 1 . A+<5A]  1  •  f  )  ■” 


1 


,2i+l  117.01 


^2..^+ 1  .A+(5Ai  .t-eSti  I  !^2w+1.A+i^Aj  i 

respectively;  i/E^  ^  0,  then  column  (-jj.k.t)  has  nonzero  entries  in  rows  (2-J.k  +  bk2.t  +  bt2  ', 
and  (2.*,'  +  1.  A*  +  (‘'A'3.  t  +  <*»t3)  with  values 

1  .  .  1 


and 


|l2--'.A+-5A2.;+6t2  I 

1 


{X2u.k+6k:i  ,t+5t2  •  ^  ”'^e)  — 


t ^2u^  +  1  .ks6 kj  + 

respectively. 


■  (^2u>+l. k-SSka. t+6t3  ■  ^’n^E)  = 


|A2w.A+iA2.f+i^2l 


- ^ - rU^I'^^lE^l 

II2W+I.A+6A2  .#+^^2  I 


Note  that  in  Theorem  4.3.1.  we  could  precisely  locate  the  positions  of  the  nonzero  entries 
of  P.  More  importantly,  each  inner  product  (involving  two  vectors  in  R‘"  )  was  expressed 
as  a  product  of  an  entry  in  and  the  cardinality  of  a  suffix-based  indicia!  set.  Section  4. -5 
gives  formulas  for  those  cardinalities.  Thus  the  projection  matrix  P  can  be  computed  in 
time  proixtrtional  to  [.Sn  ;|  and  witlwut  using  any  vectors  in  R^". 
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4.4  Structure  of  the  row  projection  matrix 

The  analysis  of  the  structure  of  the  column  projection  matrix  could  be  adapted  to  the 
row  projection  matrix  P^i  =  Dy^Y' M^Y  by  considering  directly  how  multiplication  by 
affects  prefix-based  iudicial  vectors.  It  is.  however,  more  illuminating  to  exploit  a  duality 
between  suffix-based  and  prefix-based  indicia!  vectors  arising  from  reversal  of  bit  strings. 

Definition  4.4.1  Let  n  be  a  positive  integer.  The  binary  reversal  matrix 

is  the  matrix  whose  only  nonzero  entries  are  ones  in  row  i  and  column  j.  where  the  n-hit 

binary  representation  of  i  and  j  are  the  reversal  of  each  other,  i.e. 


( Rn  )i.j 


1  if  (Tnii)  =  {(Tn(j))^ 

0  otherwise 


0  <  ij  <  2"  -  1. 


0001  0011  0101  0111  1001  1011  1101  till 


0000  0010  0100  0110  1000  1010  1100  1110 


. 

1 

0000 

1 

0001 

1 

0010 

1 

0011 

1 

0100 

1 

0101 

1 

0110 

1 

0111 

1 

1000 

1 

1001 

1 

1010 

1 

1011 

1 

1100 

1 

1101 

1 

1110 

1 

1111 

R„  is  a  reflection:  it  is  symmetric,  involutary  (R\  —  1^)  and  orthogonal  {R'^Rr,  =  R^  =  In)- 
If  Cj  €  is  the  column  of  the  identity  matrix  /„.  then  Rr,€j  =  ej  where  we  write 
as  J.  Therefore  for  the  suffix-based  indicia!  vector  x^,k.t-  we  have  RnX^.k.t  — 
since  does  not  change  the  1-bit  count  and  the  bit  transition  count  of  an  n-bit 
string.  We  thus  have  a  one-to-one  correspondence  between  suffix-based  indicia!  vectors  in 
Sn.i  and  prefix-ba.sed  indicia!  vectors  in  Tn.i.  and  RnSn.i  =  :  x  G  Sn.i}  =  T„./.  By  a 

suitable  arrangement  of  the  columns  of  we  can  make  Y'  =  Rn-X .  Then 


since 


Dy  =  r*}-  =  (A-^*  )(^„A-)  =  X'(RlRn)X  =  A'-A  =  Dx- 

The  sparsity  structure  of  Rr.M'^Rn  is  identical  to  that  of  M„  (the  parameter  values  jier- 
mute).  as  Theorem  4.4.1  shows.  We  will  not  prove  the  theorem,  but  instead  illustrate  it  for 
n  =  4.  Note  that  for  a  matrix  B  €  BRn  is  a  permutation  of  the  columns  of  B  with 

columns  i  and  j  interchanged  if  t  =  J  (0  <  i,j  <  2"  -  1).  while  R^B  \s  d.  permutation  of 
the  rows  of  B  arising  from  the  same  interchange  of  rows.  In  particular,  the  remarks  hold 
for  V^R^  and  R^iV^R^)  respectively. 


0001  0011  0101  0111  1001  1011  1101  1111 
0000  0010  0100  0110  1000  1010  1100  1110 


“o  “o 

2 

11  ^ 

Uq  Uq 

2  3 

%  u5 

u°  uj 

u] 

U?  k| 

Uj  U.2 

U2 

11°  v] 

U2 

«3  «3 

2  3 

H  I's 

j 

“3  ^3 

0  0 

^3  ^^3 
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Theorem  4.4.1  (cf.  Theorem  2.4- 1.  [Hen9l]) 


Applying  Theorem  4.4.1  to  RnM’Rni  we  have 


and  so  has  the  same  structure  as  In  fact,  they  are  diagonally  similar  (see  Ap¬ 

pendix  A). 


4.5  Selected  combinatorial  results 

Before  we  state  without  proof  some  combinatorial  results  regarding  indicial  subspaces,  we 
present  some  selections  to  get  a  feel  for  the  quantitative  behavior  of  our  indicia]  subspaces. 
Table  1  gives  the  cardinality  of  the  indicial  basis  5„.;  and  the  maximum  size  of  an  indicial  set 
I"  (with  PI  =  /  )  for  /  =  2.4.  The  latter  quantity  is  by  definition  equal  to  the  ma.ximum 
number  of  ones  appearing  in  a  basis  vector  in  Sn.i-  We  see  from  the  table  that  the  indicial 
bases  have  cardinahties  which  are  very  small  compared  to  2”:  in  contrast,  the  basis  vectors 
have  a  large  number  of  nonzero  entries.  For  example,  for  n  =  .30  and  /  =  2.  we  approximate 
by  the  subspace  spanned  by  the  1628  basis  vectors  in  ^30.2,  the  ‘‘largest’"  of  which  has 
5.95  X  10®  ones  appearing  in  it. 


n 

2" 

■ 

l‘5n.4| 

k,  t 

1*^1  =  ^ 

5 

32 

28 

2 

32 

1 

10 

1024 

148 

20 

352 

6 

15 

32768 

3t 

400 

120 

20 

1.05  X  10® 

688 

8820 

2192 

2520 

25 

3.36  X  10' 

1108 

2.33  X  10® 

3712 

63504 

30 

1.07  X  10® 

1628 

5.95  X  10® 

5632 

Table  1:  Combinatorial  properties  of  suffix-based  indicial  sets 

In  the  theorems  below,  n  and  I  are  positive  integers  with  I  <  n. 

Theorem  4.5.1  (cf.  Theorem  3.1.2  and  Corollary  3.1.4-  [Hen9l])  The  cardinalities  of  all 
the  nonempty  1-hit  suffix-based  indicial  setsV^kt’  I**’!  ~  given  by: 
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(oj  if  =  0; 

l^o.o.ol  “ 

(iij  for  I  <  k  <  n  -  1,  I  <  i  <  iuiii(2A:, ■2{n  —  k)  —  1). 


(b)  if  ^  =  1; 


li)  =  !■ 

(iiJ  for  l<k<n-l.  l<i<  min(2A;  —  1.2(n  -  k)), 


|I 


n 

l.k.t 


< 


k  -  l  \  /  n  -  k  —  1  \ 

tf2  jV  t/2-l  ) 

k  —  1  '\  /  n  —  k  —  1  \ 

it  -  l)/2  )[  it -l)/2  ) 


if  t  even 


if  t  odd 


if  t  even 


if  t  odd 


Theorem  4.5,2  (cf.  Theorem  3.3.1.  [Hen9l])  Let^  be  a  fixed  string  of  length  1.  and  let  k' 
be  the  number  of  Is  in  ^(2)  o  ■  ■  •  o^il).  and  t'  be  the  number  of  bit  transitions  in  u,-.  The 
cardinalities  of  all  the  nonempty  indicial  sets  given  by: 

(a)  if^il)=:  0: 

=  1' 

(ii)  for  k  =  I . n  -  I,  t  =  I _ .iiim(2fc.2(n  —  I  —  k)  +  1), 

=  usir'i- 


(b)  =  1; 

fiij  for  k  =  1 _ ,n  -  1.  t  =  1 . rain(2/:  -  l.2in  -  I  —  k  +  1)), 

=  urJ'i. 


Theorem  4.5.3  (cf.  Theorem  8.3.2.  [Hen9l])  Let  .jv  be  a  fixed  string  of  length  1.  For 
n  =  I  -  l.l  (mod  4), 


max 

fc.t 


ir, 


k.t\ 


f  [{;,-/  +  l)/2j  -  2  \  /  lin-l+l)/2]  \ 
\  L(n-/+1)/4J  -1  JV  [(n-/+l)/4j  ) 
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and  for  n  =  I  +  l.l  +  2  (mod  4). 


n.a.vK,.|  =  (  +  R„-,+  l)/21-l 

t.i  [(„-;+l)/4j  )\  R„_;+i)/4j 

Theorem  4.5.4  (cf.  Theorem  3.3.3  and  Corollary  3.3.4-  [HenQl])  The  number  of  nonempty 
indicial  sets  I"  ^  f  with  |a;|  =  /  is 


and  so  are  the  cardinalities  of  the  indicial  bases  Sn.i  and  Tnj.  and  the  orders  of  the  projection 
matrices  and 


Theorem  4.5.5  (cf.  Theorem  3.3.5.  [Hen9l])  The  number  of  nonzero  entries  in  each  of 

P^l  Pn.t 
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5  Implementation  of  Indicial  Subspaces 


We  describe  in  Sections  5.1  through  5.5  an  implementation  of  one  cycle  (i.e.  for  fixed  «  >  3 
and  /  >  0)  of  our  method  of  minimal  representations.  The  foci  of  our  discussion  will  be 

(a)  the  data  structures  for  representing  indicial  sets  (with  |w’|  =  /)  and  corresponding 
vectors  in  the  approximating  subspaces  span(t9„./)  and  span(7^./).  and 

(b)  efficient  algorithms  for  manipulating  such  vectors. 

Although  a  vector  x  in  span(^n./)  has  2"  entries,  at  most  of  them  are  distinct  and  our 
algorithms  for  manipulating  these  vectors  (eg.  finding  ||a:||)  wiU  exploit  this  property.  Most 
of  them  will  take  time  0(|5n.;|)  =  0(2'“^n^).  We  emphasize  that  the  basis  Sn.i  is  never 
explicity  represented  in  our  implementation. 

5.1  Data  structures  for  indicial  sets  and  subspaces 

Total  ordering  on  indicial  objects  In  the  preceding  development  of  the  theory  of  indi¬ 
cial  subspaces,  the  ordering  of  the  nonempty  indicial  sets  with  p|  =  /  (or  equivalently, 
the  ordering  of  the  basis  vectors  in  and  Tn.i)  was  unimportant.  However,  any  imple¬ 
mentation  of  our  method  has  to  choose  an  exphcit  ordering  and  the  representation  of  the 
projection  matrices  and  depends  on  it. 

To  make  our  discussion  uncluttered,  we  work  with  triples  pl  =  /.  rather  than 

directly  with  indicial  sets  and  basis  vectors  x^^k.t. 

Definition  5.1.1  .4  triple  (^.k.t)  is  legitimate  if  the  corresponding  indicial  set  is 

nonempty  (or  equivalently,  x^,k.t  «  basis  vector  in  Sn.i)-  W  c  define  a  total  ordering  -</ 
on  legitimate  triples  as  follows:  (uJ.k.t)  -<j  (u;'.  Ar',  f').  if  and  only  if 

(a)  v{uj)  <  v(  sA/  ) .  or 

fb)  i’(w)  =  r(u;')  and  k  <  k' .  or 

(c)  v{^)  =  v(u;').  k  =  k'  and  t  <  t' . 

The  triple  in  this  ordering  will  be  denoted  by  (u;,.  Ay.  tj ).  0  <  ?  <  —  1.  For  a 

legitimate  triple  (^\k.t).  ^(u:.k.t)  denotes  its  ranking  under  -<;  (i.e.  $(u.’,.  Ay.  t, )  =  ij. 

We  display  the  ordering  for  n  =  4  and  1  =  2.  where  the  nonempty  indicial  sets 
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(PI  =  2)  are  singletons. 


fjJ 

k 

t 

0 

00 

0 

0 

{0000} 

1 

00 

1 

1 

{1000} 

2 

00 

1 

2 

{0100} 

3 

00 

2 

1 

{1100} 

4 

01 

1 

1 

{0001} 

5 

01 

2 

2 

{1001} 

6 

01 

2 

3 

{0101} 

7 

01 

3 

2 

{1101} 

8 

10 

1 

2 

{0010} 

9 

10 

2 

2 

{0110} 

10 

10 

2 

3 

{1010} 

11 

10 

3 

1 

{1110} 

12 

11 

2 

1 

{0011} 

13 

11 

3 

1 

{0111} 

14 

11 

3 

2 

{1011} 

15 

11 

4 

0 

{1111} 

Induced  ordering  on  prefix-based  indicia!  objects  The  ordering  of  the  legitimate 
triples  induces  an  ordering  of  the  nonempty  suffix-based  indicial  sets  (p|  =  /  )  and  the 
basis  vectors  in  Sn.i-  Thu.s  in  forming  the  column  projection  matrix  P^j  =  X' MnX 
where  Dx  =  A'* A',  the  columns  of  A'  are  the  vectors  in  Sn.i  in  the  prescribed  order.  VVe  note, 
however,  that  in  showing  that  the  projection  matrices  P^i  —  Dy^Y’ M~Y  (Dy  =  Y'Y] 
and  P^i  have  the  same  sparsity  structure  (Section  4.4)  we  exploited  the  duality  between 
suffix-based  and  prefix-based  indicial  objects.  Specifically,  we  assumed  that  Y  =  R„X. 
where  Rn  is  the  2"  x  2"  binary  reversal  matrix.  Recall  that  Rn^^.k.t  =  so  the 

columns  of  T  must  be  ordered  in  ascending  order  of  r(w^).  k  and  t.  .Just  as  the  development 
of  the  theory  of  prefix-based  indicial  subspaces  was  simplified  by  relating  it  to  suffix-based 
indicial  subspaces  using  Rn.  the  implementation  of  prefix-based  indicial  subspaces  enjoys  a 
similar  simplification.  In  fact,  the  algorithms  that  we  will  develop  for  suffix-based  indicial 
subspaces  carry  over  to  prefix-based  ones  with  little  or  no  change.  Henceforth,  we  shall  no 
longer  mention  the  matrix  Y .  but  use  the  equivalent  matrix  RnX. 

Coefficient  vectors  Before  proceeding  further  with  implementation  details,  we  introduce 
some  terminology  that  will  be  useful  in  the  discussion.  For  a  vector  g  €  span(5n,;).  there 
is  a  unique  representation  of  g  with  respect  to  the  basis  Sn.i  (under  the  total  ordering  <i). 
We  shall  call  this  representation  the  suffix-based  coefficient  vector  for  g.  and  denote  it  by 
g.  Note  that  g  £  and  g  =  Xg.  W’e  define  prefix-based  coefficient  vectors  similarly. 

Since  |5n,/|  =  |7^.;|  by  duality,  we  shall  identify  with  'L  The  type  of  a  coefficient 
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vector  will  be  clear  from  the  context. 


Encoding  the  total  ordering  We  now  consider  the  task  of  encoding  the  ordering.  To 
justify  our  implementation  we  describe  a  simple  approach  and  point  out  its  defects.  A 
straightforward  approach  is  to  use  the  triple  |u;|  =  /.  as  an  index.  Thus  a  vector 

g  €  'I  would  be  stored  a.s  a  3D  array  of  real  numbers  ^[0  . .  .2^  —  1][0  . .  .n][0  . .  .n  —  1]. 
We  note  that  there  are  two  serious  disadvantages  to  this  approach.  Firstly,  the  vector  g 
requires  2^n(n  +  1)  words  of  storage,  even  though  <  2^~^n(n  +  1).  More  importantly, 
not  every  triple  is  a  legitimate  index  and  so  each  access  to  the  elements  of  array 

g  requires  a  check  on  the  values  of  k  and  t.  In  particular,  the  subroutine  which  extracts 
eigenvalues  must  perform  the  check  repeatedly. 

Our  indexing  scheme  The  weakness  of  the  simple  approach  lies  in  its  noncontiguity 
of  storage,  i.e.  there  are  “gaps”  in  the  3D  array  which  are  not  used.  We  overcome  it  by 
using  an  indexing  scheme  which  maps  legitimate  triples  kj)  to  the  nonnegative  integers 
{0.1 . -  1}  using  the  total  ordering  -<;. 

Data  structures  The  indexing  scheme  is  realized  using  two  complementary  data  struc¬ 
tures:  one  mapping  triples  to  integers,  and  the  other,  integers  to  triples.  We  first  iDustrate 
the  ideas  using  the  example  ( n  =  4.  /  =  2)  given  earlier.  Note  that  for  each  of  the  four  pairs 
of  (w.A-)  having  two  values  of  t  (namely.  (00.  1).  (01.  2).  (10.  2)  and  (11.  3)).  the  value 
$(w.k.t)  -  t  is  constant.  We  can  thus  tabulate  $(*•.  A'.t)  —  f  in  a  2D  index  array  as  shown 
below. 


t’(w') 

A-  =  0 

k  =  1 

k  =  2 

A'  =  3 

k  =  4 

0(f  =  0) 

0(t=  1.2) 

2{t  =  1) 

3 (t  =  2.3) 

0 

CM 

It 

7(/  =  2.3) 

10(t  =  1) 

3 

11  (/  =  1) 

12(/  =  1.2) 

15  (/  =  0) 

So  for  a  proper  triple  (yj.k.t)  with  p|  =  2.  the  rank  $(«;./:./)  is  given  by  nideT[u(w )][/;] +  /. 

In  general,  by  Theorem  4.5.2.  for  a  fixed  of  length  /  and  a  legitimate  k.  the  values  of 
t  giving  legitimate  triples  (^.k.t)  are  “contiguous".  So  we  can  define  $  using  a  2D  array 
index[0 . .  .2‘  -  l][0...n]  (i.e.  index  has  2‘  rows  and  n  -|-  1  columns,  with  row  subscripts 
running  from  0  through  2^  -  1  and  column  subscripts  running  from  0  through  n): 

$(w’.Ar.  t)  =  mdex[u(u;)][fc]  +  t. 


or 

indei[r(u;)][fc]  =  ^{•^•.k.t)  -  i 


where  t  is  the  smallest  value  permitted  by  Theorem  4.5.2.  Note  that  for  each  a.’  €  {0.  l}^ 
not  every  k  in  the  range  (0 n}  and  1  in  the  range  (0 n  -  1}  are  legitimate.  Thus 
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the  index  array  contains  elements  which  are  actually  undefined,  (in  fact,  index  is  l)anded 
with  width  n  -  I  +  I  by  Theorem  4.5.2)  and  this  is  a  reflection  of  the  inherent  noncontiguity 
of  legitimate  triples. 

The  second  data  structure  consists  of  three  1-dimensional  arrays  ind.imf.  ind.k  and 

ind.t.  each  with  \Sn.i\  elements,  mapping  an  integer  i  £  {0 _ _  —  1}  to  the  (value  of 

the)  suffix,  the  1-bit  count  and  the  bit  transition  count  respectively  of  the  triple. 

In  addition  to  the  above  two  data  structures,  we  have  another  1-dimensional  array 
cou7i/[0 . |5n./|  —  1]  where  count[?]  gives  the  cardinality  of  the  indicial  set. 

Abstractions  for  data  structures  To  hide  the  details  of  our  data  structures  from  the 
rest  of  the  program,  we  provide  two  “converter”  subroutines.  The  first  subroutine.  get_triple. 
takes  as  input  an  integer  i  and  returns  the  triple  the  second  subroutine 

getJndex.  takes  as  input  a  triple  and  returns  The  subroutines  are  easily 

defined  using  the  data  structures. 

Initialization  of  data  structures  The  code  in  Figure  1  initializes  the  data  structures  by 
using  Theorem  4.5.2  to  enumerate  the  legitimate  triples  in  order.  The  variable  index.couni 
gives  a  running  total  of  the  number  of  triples  enumerated  so  far.  and  wiU  therefore  be  equal 
to  \Sn.i\  at  the  end. 

Advantages  of  Indexing  Scheme  We  remark  that  our  indexing  scheme  provides  a  useful 
abstraction  for  indicial  vectors.  Note  that  a  coefficient  vector  g  £  Rl>>n./I  has  two  interpre¬ 
tations:  it  can  be  regarded  simply  as  an  |t5n./|-dimensional  vector,  or  more  importantly,  it 
gives  the  coefficients  of  the  linear  combination  for  the  associated  vector  g  £  span(5n./): 

l-Sn.ll-l 

9=  21  9(i)Xu;,.k,.u-  (2) 

1=0 

Our  indexing  scheme  provides  a  clear  separation  between  these  two  interpretations.  By 
itself,  the  array  p[.  • .]  is  just  the  standard  representation  of  the  vector  g.  and  this  is  the 
view  seen  by  the  subroutines  for  extracting  the  dominant  eigenvalues  and  eigenvectors  of 
the  projection  matrices.  When  coupled  with  the  indexing  scheme,  g  can  be  regarded  as  a 
coefficient  for  g  in  (2).  Indeed,  we  find  that  for  0  <  i  <  2"  -  1. 

g(j)  =  5[$( //(/?- /-t-l)o  •■•o/i(n).K(/r).-r(|j))]  where /r  =  fT„  ( i ) 

=  g[get. index  ini  /!-/ 4- 1 )  o  •••  o  /i(  n ).«(//).  r(/i))]. 

Thus  we  can  easily  find  the  value  of  a  component  of  g  given  its  coefficient  vector  g. 

5.2  Construction  of  projection  matrices 

Data  structures  for  projection  matrices  As  was  shown  in  Sections  4. .3  and  4.4.  the 
projection  matrices  ,  and  P^i  are  sparse  with  either  2  or  4  noiizeros  per  column.  In  fact. 
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variable  index.count.  k' ,  t' ,  k.  t 

index.count  —  0 


/*  Enumerate  strings  of  length  I  with  leading  bit  0  */ 
for  £0o  {0. 1}'“^ 

k'  —  k(^('2)  0  ■  ■  ■  0  uj{l)) 
t'  —  T(^) 

ind.su /[index. count]  — 
ind.k[index  .count]  —  k' 
indJ[index. count]  —  t' 

—  index.count  —  t' 
count[index  .count]  —  1 
index.count  —  index  .count  +  1 


for  k  =  I . n  —  I 

/ndex[i’(a;)][A:  +  k']  —  index.count  -  (t'  +  1) 

for  t  =  1 . min{2k.'2(n  —  I  —  k)  +  1) 

ind.suf  [index. count]  —  r(^’) 
ind.k[index  .count]  —  k  +  k' 
ind.t[index  .count]  —  t  +  t' 
count[index. count]  — 
index.count  —  index.count  +  1 
endfor 
endfor 
endfor 


/*  Enumerate  strings  of  length  I  with  leading  bit  1  */ 
for  ^'€10  {0. 1}'"^ 
similar  code 
endfor 


Figure  1:  Code  for  initialization  of  indexing  scheme  data  structures 
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P^i  is  similar  to  (see  Appendix  A)  and  so  we  need  only  We  represent  P^]  as  a 
sequence  of  packed  columns.  Specifically,  we  have  three  arrays  col[0  . .  .d- 1].  roir[0  1], 

and  col.pToj[Q  . .  .d  —  1],  where  d  =  1  +  (r?— /)(7j— /+l)/2)  is  the  number  of  iionzeros  in 

P^i  (see  Theorem  4.5.5)  with  rou'[/]  and  co/[/]  giving  the  position  of  the  nonzero  entry 
of  P^j.  and  coLproj[i]  giving  its  value. 

Initialization  of  projection  matrix  data  structures  Theorem  4.3.1  identifies  the 
nonzero  entries  of  the  projection  matrices,  and  is  used  in  Figure  2.  which  constructs  the 
projection  matrices.  The  2D  array  A/[0  . .  .3][0 . .  .3]  holds  the  sixteen  (not  distinct)  param¬ 
eters  of  Mn  as  defined  below. 


The  variable  matsize  counts  the  number  of  nonzero  entries  computed  so  far. 

Applying  tho  column  projection  matrix  to  vectors  We  briefly  remark  on  how  we 
can  apply  on  the  left  and  on  the  right  to  vectors.  Let  g  €  The  matrix- vector 

product  P^jQ  is  formed  by  accumulating  the  linear  combination  (with  coefficients  given  by 
the  entries  of  $)  of  the  columns  of  P^;  appropriately  scattered:  the  vector-matrix  product 
g'Pn,i  is  formed  elementw'ise  by  taking  the  inner  product  of  g  with  each  column  of  P^/. 

5.3  Applying  the  transfer  matrix  to  approximate  eigenvectors 

Of  interest  in  measuring  the  quality  of  our  approximations  is  the  matrix-vector  product  Mng. 
where  q  is  an  approximation  to  a  column  eigenvector  of  from  the  subspace  span(5n./)- 
and  the  corresponding  product  .V/*p.  where  p  is  an  approximation  to  a  row  eigenvector 
from  the  dual  subspace  span(T,i,/).  In  this  section,  we  shall  consider  how  we  can  effectively 
compute  such  products  in  time  0(|«5n,/|)-  and  represent  them  in  a  manner  similar  to  the 
approximate  eigenvectors. 


Action  of  general  duodiagonal  matrix  We  first  consider  the  general  matrix- vector 
product  ['np.  g  6  span(*5„,;).  The  key  to  an  efficient  computation  of  Unp  is  Corollary  4.2.4. 
which  describes  the  action  of  f'„  on  a  basis  vector  in  S^.i-  Let  g  €  be  the  coefficient 

vector  of  g.  i.e. 

9=  Y. 

(=0 


Then 


n9  ~  ^  ■  9(^)^  n^w,.k,,t, 

!=0 


is  a  linear  combination  of  images  of  basis  vectors  in  Sn.i  under  the  action  of  By 
Corollary  4.2.4.  each  such  image  is  it.self  a  linear  combination  of  vectors  in  Thus 
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variable  matsize.  i.  k.  t.  k\  f,  rowJndex 
matsize  —  0 

for  i  =  0 . j5„,/|  -  1 

{uj.k.t)  —  getjriple(i) 

determine  the  leading  bit  g  of  strings  in  E  = 

!*  Compute  nonzeros  associated  with  E°  */ 

2/(E°  is  nonempty) 

determine  k'  and  t'  such  that  2E°  = 
col[matsize]  —  i 
rowJndex  —  getJndexi'ljj.k'.t') 
row[nnatsize]  ■  wJndex 

coLproj[m  >.  -cj  —  |E®|  ♦  jV/[2t'(^(/))][u(/2  o  0)]  /  count[7-ou'.mdea:] 
matsize  — matsize  +  1 

determine  k'  and  t'  such  that  2E°  +  1  = 

■.:ol[matsize]  —  i 

rowJndex  —  getjndex('2^+l,k\t') 
row[matsize]  —  rowJndex 

coLproj[m.atsize]  —  |E°|  *  M[2v(>jj{1}}  +  l][n(//  oO)]  /  counf[rowJndex] 
matsize  — matsize  +  1 
endif 

/*  Compute  nonzeros  associated  with  E^  */ 
similar  code 
endfor 


Figure  2:  Code  for  initializing  the  column  projection  matrix 
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n9  €  span(5n,/+i  )•  and  we  can  compute  it  by  building  up  the  coefficients  for  each  basis 
vector  of  . 


Data  structure  for  image  vectors  As  in  the  representation  of  vectors  in  span(».^„./ ).  we 
impose  a  total  ordering  -</+i  on  the  legitimate  triples  (fi.k.t)  with  |/x|  =  /  +  1.  and  encode 
the  ordering  with  similar  data  structures  and  converter  subroutines.  To  distinguish  them 
from  the  subroutines  for  the  ordering  -<i.  the  subroutine  converting  legitimate  triples  (p.  k.  t ) 
(|/i|  =  /  +  1)  to  their  ranking  under  the  ordering  will  be  called  get  Jndex.prod.  and 
the  subroutine  returning  the  legitimate  triple  wiU  be  called  get.triple.prod.  In  addition, 
the  array  count. prod[. . .]  wiU  store  the  cardinalites  of  the  indicial  sets  |p|  =  /  +  1.  .4 

vector  h  £  span(5n,/+i)  will  be  represented  by  its  coefficient  vector  h  6 

Code  for  computing  the  matrix- vector  product  Figure  .3  gives  the  code  for  the 
subroutine  apply_matrix(5,  h )  which  takes  as  input  the  coefficient  vector  g  for  the  vector 
g  e  span(5„.().  and  returns  the  coefficient  vector  h  for  the  matrix- vector  product  Mn9-  Tlie 
array  .V/[. . .][. . .]  holds  the  sixteen  parameters  of  M„  and  was  defined  in  Section  5.2. 

Applying  the  adjoint  of  the  transfer  matrix  We  consider  the  dual  matrix-product 
*1/^5  for  a  vector  g  £  span( Tn./).  whose  coefficient  vector  is  g.  Let  A'/  and  A'/+i  be  the  basis 
matrices  associated  with  the  bases  Sn.i  and  Sn.i+i  respectively.  Then 

Rn-Mn9  =  {Rn^^nRn}-^ld  €  Span(5n.<+l) 
since  RnM'Rn  is  duodiagonal.  and  so 

M^g  £  i2„span(5„./+i )  =  span(Tn,/+i  )• 

The  coefficient  vector  h  of  M~g  is  defined  by 

=  M:g  =  M^Xig 


and  so 

A'/+l^  =  (RnM^Rn)Xig. 

i.e.  h  is  the  coefficient  vector  of  the  image  of  Xig  £  spanl^S^./)  under  the  action  of  the 
duodiagonal  matrix  Rr,M‘Rn.  The  code  for  the  corresponding  subroutine  apply.transpose 
is  therefore  identical  to  that  of  apply  .matrix,  except  that  the  array  RM  R[. ..][...].  which 
holds  the  sixteen  parameters  of 


is  u.sed  in  place  of  A/[. . .][. .  .] 
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subroutine  apply-matrix(g.h) 
output  h 

variable  i.  pi.  w,  k.  t.  k' .  t' .  imageJndex 
for  i  =  0 . i*5n./+il  -  1 

h[i]  -  0.0 

for  i  =  0 . l«Sn.;l  -  1 

i^-.k.t)  —  g€tJriple(i) 

determine  the  leading  bit  pi  of  strings  in  E  = 

/*  Compute  contributions  from  E°  */ 
if  (E°  is  nonempty) 

determine  k'  and  t'  such  that  2E^  =  ^^oO.k'.t' 
image.index  —  getJndex{^'o0.k.t  ) 

h[image. index]  —  h[image -index]  +  M[2v(.^(l))][v(pi  o  0)]  *  5[/] 

determine  k'  and  t'  such  that  2E°  +  1  =  'i-woi.k'.t' 
image-index  —  get-index(^’  o  i.k  .t  ) 

h[image-index]  —  h[imag e -index] M[2v{^{l))  °  0)]  *  5[*] 

endif 

/*  Compute  contributions  from  E^  */ 
similar  code 
endfor 


Figure  3:  Code  for  applying  the  transfer  matrix 
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5.4  Computation  of  inner  products 

\arious  quantities  of  iiiteresi  derived  from  subspace  approximations  require  the  computation 
of  inner  products,  eg.  norm  of  a  vector,  angle  between  two  vectors,  and  the  generalized 
Rayleigh  quotient.  In  our  case,  the  vectors  could  come  from  four  different  subspaces  of  R‘”; 
spanf^rt.;)  and  span(T„.;)  containing  appro.ximate  column  and  row  eigenvectors  respectively: 
and  span(c>„,/+i )  and  spaiifT^  z+i )  containing  the  images  of  these  approximations  under  the 
action  of  the  transfer  matrix  Mn  and  its  adjoint  M~  respectively.  Our  goal  in  this  section 
is  to  describe  efficient  methods  for  computing  the  inner  products. 

Types  of  inner  products  Let  gi.go  €  be  coefficient  vectors  for  vectors  in  either 

span(5„.;)  or  span(7ji,;).  and  let  hi-h^  €  R'^"  '+d  be  coefficient  vectors  for  vectors  in  either 
span!  )  and  span(T„,;+i ).  Depending  on  the  combination  of  the  subspaces  containing 

the  two  vectors  whose  inner  product  we  wish  to  compute,  we  have  one  of  the  ten  subroutines 
below.  Here  A'/  and  A’/+i  are  the  basis  matrices  associated  with  the  bases  Sn.i  and 
respectively. 

(a)  coLcolJp(5i.52)  =  A'/52) 

-  inner  product  of  two  approximate  column  eigenvectors 

(b)  row  .row  Jp(  51.^12)  =  (^RnX'igi.  R^Xih) 

-  inner  product  of  two  approximate  row  eigenvectors 

Ic)  coLrowJpl 51.52)  =  (.V/5i.  .R„A‘;52) 

-  inner  product  of  an  approximate  column  eigenvector  and  an  approximate  row  eigen¬ 
vector 

(d)  colp.colpJp(^i.h2 )  =  A'z+i^z) 

-  inner  product  of  images  of  two  approximate  column  eigenvectors 

fe)  rowpjowpJp(7ii .  7)2 )  =  A'/+i ^1  • -Rn-Yz+i 

-  inner  product  of  images  of  two  approximate  row  eigenvectors 

(f)  colp.rowpJp(7i.72)  =  (^Xi+ihi.  R„Xi+ih2^ 

-  inner  product  of  images  of  an  approximate  column  eigenvector  and  an  appro.ximate 
row  eigenvector 

(g)  coLcolpJp(5i.7i )  =  (|A'/5i.  AVi^i) 

-  inner  product  of  an  approximate  column  eigenvector  and  image  of  an  appro.ximate 
column  eigenvector 

(h)  rowj-owpJp(5i.hi )  =  {^RnA'/pi .  .R„A'/+i/ii^ 

-  inner  product  of  an  approximate  row  eigenvector  and  image  of  an  approximate  row 
eigenvector 
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(i)  coLro\vpip(gi./)i)  =  (^Xig-i.RnXi+ihi'^ 

-  inner  product  of  an  approximate  column  eigenvector  and  image  of  an  approximate 
row  eigenvector 

(j)  row_colpJp(5i.*2)  =  A'/+i^i) 

-  inner  product  of  an  approximate  row  eigenvector  and  image  of  an  approximate 
column  eigenvector 

The  reader  should  note  that  the  arguments  to  the  subroutines  are  coefficient  vectors,  i.e. 
the  inner  product  of  two  vectors  is  computed  from  their  associated  coefficient  vectors. 

It  is  easily  verified  (using  the  orthogonality  property  R’Rn  =  /„)  that  subroutines  (a) 
and  (b).  (d)  and  (e),  (g)  and  (h),  and  (i)  and  (j)  are  identical: 

•  rowj-owJp(5i,g2)  =  coLcoUp(  51,^2) 

•  rowp_rowpJp(/?i./22)  =  colp.colp  Jp( /ii . /i2 ) 

•  rowjrowpJp(5i,^i )  =  coLcolpJp(^i. ) 

•  row_colpJp(5i./ii )  =  col_rowpip(5i.hi) 

Thus  we  have  six  different  types  of  inner  products  to  consider. 

5.4.1  coLcoLip(5i.52) 

We  begin  by  considering  the  simplest  inner  product,  that  involving  two  vectors  gi.g2  € 
span(5n./).  whose  coefficient  vectors  are  gi  and  §2  respectively,  i.e.  g\  =  A'/^i-  </2  =  Xig2- 

Simple  approach  The  straightforward  approach  to  computing  the  inner  product  of  g^ 
and  g2  is  to  accumulate  it  elementwise.  We  saw  in  Section  5.1  how  we  can  find  the  value  of 
any  component  of  gi  (or  52)  from  its  coefficient  vector  ^1  (or  ^2)-  Thus  we  can  evaluate 

2"-l 

{91-92)  =  ^  9i(i)92ii) 

1=0 

by  accessing  the  appropriate  entries  of  gi  and  52-  This  simple  approach  has  one  serious 
flaw:  it  takes  time  0(2")  even  though  gi  and  ^2  each  have  at  most  distinct 

entries. 

Efficient  computation  The  key  to  an  efficient  computation  of  (31.92)  (and  the  other 
inner  products  that  we  shall  consider)  is  to  choose  an  appropriate  basis  and  to  express  the 
computation  in  terms  of  that  basis.  The  simple  approach  (implicity)  uses  the  standard 
basis:  the  efficient  approach  uses  the  “natural”  basis  for  31  and  32.  namely  Sn.i-  Indeed. 

(31-92)  =  (A'/9i..\'/32) 
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ftiibmutine  col.col-ipig1.g2) 
variable  i.  ip 
ip  —  0 

for  i  =  0 - -  KV/I  -  1 

ip  —  ip  -f-  count[i]  *  5i[i]  *  52(0 
endfor 
return  ip 

Figure  4:  Code  for  subroutine  coLcolJp(pi.52) 

subroutine  colp.colp.ip(  .  /12 ) 
variable  i.  ip 
ip  —  0 

for  i  =  0 . |<S„,/+i|  -  1 

ip  —  ip+  count.prod[i]  *  *  ^2[*] 

endfor 
return  ip 

Figure  5;  Code  for  subroutine  colp.colpJp(pi,p2) 

=  9i^’^ld2 

=  Yi  9iii)d2(i)\lw,.k,.t,\  (3) 

t=0 

since 

A/ A/  ■=  d2a5f( II  ,  ll^r^j .fcj .ti II  ■  ••■) 

where  (u;,.k,.ti)  is  the  i*'^  triple  under  the  total  ordering  -<1  described  in  Section  5.1.  The 
sum  (3)  takes  time  0{  |»Sn./| )  and  translates  easily  into  the  code  in  Figure  4  for  the  subroutine 
coi.colJp. 

5.4.2  colp_colp_ip(hi.h2) 

This  inner  product  is  similar  to  the  previous  one.  except  that  the  vectors  now  come  from 
span(5„.;+i)  and  so  the  appropriate  basis  to  use  is  Sn.i+i-  We  remind  the  reader  that 
count.prod[i]  gives  the  cardinality  of  the  nonempty  indicia!  set  with  |k^’|  =  /+1  under 
the  ordering  -<;+i  (see  Section  5.3).  The  code  in  Figure  5  clearly  takes  time  0{Sn.i+i)  = 
0{2‘n'^ ). 
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5.4.3  coLcolp_ip(5. /i) 

This  inner  product  involves  two  vectors  from  different  subspaces:  g  £  span(»f>n,/).  whose 
coefficient  vector  is  g.  and  h  £  span(<S„.;+i ),  whose  coefficient  vector  is  h.  i.e.  g  =  Xtg. 
h  =  Xi+ih.  We  remind  the  reader  that  and  Sn.i+i  consist  of  basis  vectors  associated 
with  nonempty  indicial  sets  with  pj  =  /  and  jcj(  =  1+1  respectively.  Thus  5„./+i 
is  a  "refinement"  of  S„j.  and  span(<S„.;)  C  span(5„./+i ).  So  the  natural  basis  in  which  to 
express  the  inner  product  {g,h)  =  is 

The  main  task  is  to  represent  the  vector  g  £  span(5nj)  with  respect  to  the  basis  Sn.i+i- 
As  usual,  we  let  with  lijjj  =  /  be  the  legitimate  triple  under  the  ordering  +i. 

Then  bv  definition. 

(•Sn,,!-! 

5=  XI  (d) 

j=o 

Let  (pi.  A-'.t')  with  \g.i\  =  /  +  1  denote  the  legitimate  triple  under  the  ordering  ^/+i.  We 
seek  the  coefficient  vector  g  £  satisfying: 

I'^n.l+l  1  —  1 

ff=  XI  (5) 

1=0 

Consider  the  basis  vector  appearing  in  (5).  Since  is  a  refinement  of  Snj.  the 

coefficient  g{i)  is  equal  to  g(j}  where  j  satisfies: 

=  (pi(2)o  ...ofii(l.  +  1). A'./'). 

In  terms  of  our  inde.xing  functions. 

j  =  getJndex(/ri(2)o---o/i,(/  +  l).A:-.t-). 

Having  determined  the  coefficient  vector  of  g  with  respect  to  the  basis  Sn.t+\.  we  can 
compute  the  inner  product  of  g  and  A  in  a  manner  similar  to  the  previous  cases.  The  code 
is  shown  in  Figure  6  and  takes  time  0{Sn.i+i )  =  0(2't7^). 

5.4.4  coLrowJp(gi.^2) 

This  inner  product  is  needed  for  the  error  estimates.  It  involves  two  vectors  from  dual  sub¬ 
spaces:  gi  £  span(5n./).  whose  coefficient  vector  is  gx.  and  p2  €  span(7^.;).  whose  coefficient 
vector  is  §2-  he.  gi  =  A';pi.  p2  =  RnXig2-  This  inner  product  is  more  difficult  to  compute 
than  any  of  the  previous  three  because  the  two  subspares  are  not  related  in  a  trivial  way. 
i.e.  they  do  not  satisfy  any  containment  relations.  There  are  two  very  different  cases  to 
consider:  21  +  1  >  n  and  2/  +  1  <  n. 


Case  I:  2/  -I-  1  >  r? 
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subwutint  coLcolpJp(g.  h ) 
variable  i,  ip.  p.  k.  t 
ip  —  0 

for  i  =  0 . |*S„.;+i  I  -  1 

ip.k.t)  —  get. triple  jprod(i) 
ip  —  ip  +  count. prod[i]  *  /?[?']  * 

g[get.index{p{2]  q  ■  ■  ■  o  p{l  +  1).  fc,<)] 

endfor 
return  ip 


Figure  6:  Code  for  subroutine  coLcoIp_ip(p./i) 

We  shall  show  that  in  the  inner  product 

2**  — 1 

{91-92)  =  ^  9i(i)92{i)-  (6) 

1=0 

the  products  £fi(i)</2(j)  (0  <  /  <  2’^  —  1)  could  all  be  distinct,  and  so  there  is  no  better  way 
to  compute  the  inner  product  other  than  the  sum  in  (6). 

Here  is  the  reason. 

Lemma  5.4.1  For  0  <  i  ^  j  <  2^  -  1.  there  exist  g\  and  g2  such  that  9\{i)g2{i]  # 
9\{j)92{j)- 

Proof.  Since  2/  +  1  >  n.  a  string  p  of  length  n  is  uniquely  determined  from  its  /-bit  suffi,\.. 
its  /-bit  prefix,  and  its  1-bit  count.  Thus  (7n(i)  and  crn{j)  either  have  different  /-bit  suffixes, 
different  /-bit  prefixes,  or  different  1-bit  count.  So  <T„(t)  and  an(j)  cannot  both  be  in  the 
same  suffix-based  indicia!  set  (with  =  /)  and  in  the  same  prefix-based  indicia!  set 

(with  PI  =  /).  Therefore,  and  giij)  are  not  constrained  to  be  equal,  and  neither 
are  gjii)  and  g^ij).  and  there  exists  and  g2  such  that  gi{i)g2(i)  ^  g\{i)92{i]-  O 

Equivalently,  the  smallest  subspace  of  containing  both  span(5„,/)  and  span(7^./)  is 
itself,  but  the  proof  is  not  trivial. 

The  code  in  Figure  7  computes  the  inner  product  for  the  case  2/  -(-  1  >  n  in  time  0(2"). 
We  remark  that  we  do  not  expect  this  case  (i.e.  I  >  (n  —  l)/2)  to  occur  often  as  we  would 
not  typically  have  /  that  large. 


Case  11:  2/  -(-  1  <  n 

We  seek  a  common  basis  in  which  to  express  g\  =  A'/pi  €  span(«S„,/)  and  g2  =  RnXig2  € 
span(7^,/).  Since  the  basis  vectors  in  Sn.i  and  T^j  are  associated  with  suffix-based  and 
prefix-based  indicia]  sets  respectively,  the  common  basis  should  have  vectors  associated 
with  more  general  indicia!  sets  involving  both  suffixes  and  prefixes. 
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•subroutine  colj'ou'.ip[g\.g2 ) 
variable  i.  ip,  u;,  k.  t 
ip  —  0 

for  i  =  0 _ _  2"  -  1 

—  CTn(0 
k  =  K{a;) 

t  =  r(w-) 

ip  —  ip  +  gi[getJnd€x{yj(v  —  l  +  l)  o  •  -  •  o ij(n). fc. /  )]  * 
g2[getJndex(yj{l)  o  ■  ■  ■  o  ^(l).  k,t)] 

endfor 
return  ip 


Figure  7:  Code  for  subroutine  coLrowJp(5i.^2)-  case  I 


General  indicial  sets  Formally,  for  0  <  A;  <  w.  0  <  <  <  n  —  1  and  p\,p2  €  {0.  l)^  we 
define  a  general  indicial  set  to  be 


fc.t  •=  {f  €  {0.  l}"  :  «(//  )  =  k.  t(p)  =  t.  p\  is  a  suffix  of  p  and  p2  is  a  prefix  of  p}. 

In  analogous  fashion  to  suffix-based  indicial  sets,  we  can  define  general  indicial  vectors 
-"2  Ml  kt  general  indicial  basis  For  a  complete  discussion  of  general  indicial 

sets,  see  Sections  2.6.  2.7  and  Appendix  A  of  [Hen91]. 

Note  that  for  0  <  /r  <  n.  0  <  f  <  n  -  1.  €  {0.  l}^ 


i: 

M2€{0.1}' 


and 


MiSfO.l}' 

=  H  '..•«.( Ml 

~  y  ^  ~u.-^,Mi  .fc.f 

mi€{0.1}' 

So  the  coefficient  of  ,k.t  in  the  representation  of  gi  with  respect  to  the  basis  Vn.i.i  is  gi(i) 
for  the  /'■*'  triple  (^,.k,.ti)  =  (pi.k.t):  and  the  coefficient  of  :^2,^i-k.t  in  the  representation 
of  52  with  respect  to  V'n.;,;  is  52(7)  for  the  triple  (wwj. fcj. )  =  ((p2)^.  k.t).  Since  the 
basis  is  orthogonal,  the  inner  product  (51.52)  is  the  summand  in  (3)  adapted  to  the 
basis  Vn,i,i- 
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Enumeration  of  general  indicial  sets  To  accumulate  the  sum.  we  need  to  enumerate 
the  basis  or  equivalently,  the  nonempty  general  indicia!  sets.  This  is  accomplished 

through  the  many-to-one  correspondence  below. 

Lemma  5.4.2  Let  Q  be  the  collection  of  nonempty  general  indicial  sets  ~ 

1^2!  =  1)  andl  be  the  collection  of  nonempty  1-bit  suffix-based  indicial  sets  ( |^’|  =  1  )■ 

The  function  f  :  Q  —  J  given  by 


/(G” 


\  ^  TTl— 2/+2 

.k,ti  * 


where 


k'  =  fc  -  k(^2(1)  •  o  ;t.2(/ -  1))  -  k(//i(2)  0  •  -  •  0  ^2(0). 

and  t'  =  t  —  T[p,2)  —  TiP'i)- 

preserves  cardinalities  and  gives  a  2^^~^-to-one  correspondence  between  Q  and  J . 

Proof.  For  a  string  p  €  G"^  dropping  its  leading  I  —  1  and  trailing  /-  1  bits  transforms 
it  into  a  string  in  In  addition,  each  string  in  is  uniquely  obtained  from 

a  string  p  €  G"^  ^  in  such  a  manner.  Thus  /  preserves  cardinalities.  Let  €  I. 

By  Proposition  4.1.4.  the  strings  in  have  a  common  leading  bit.  which  we  shall  call 

p.  It  is  easily  verified  that 

p  =  P2(ii  =  piiiu 

k  =  k'  +  k(p2(1)o  •  ■  •  0  p2(l  -  1))  +  k(pi{2)  0  ■  ■  ■  0  p2{l)). 
and  f  =  f'  +  t(p2)  +  T(pi )} 

and  so  /  gives  a  2*^~^-to-one  correspondence.  □ 

The  code  in  Figure  8  uses  Lemma  5.4.2  to  enumerate  the  nonempty  general  indicial  sets. 
It  is  clear  that  the  code  takes  time 

0(22'-2|5„_2/+2.i1)  =  0{2^^-^(2  +  (n  -  21  +  2)(n  -  21  +  m 
=  0{2^‘-^n^). 


5.4.5  colp_rowpJp(/ii./!2) 

This  inner  product  involves  hi  €  span(«5n./+i )  and  /?2  €  span(7^,/+i ).  whose  coefficient 
vectors  are  hi  and  ^2  respectively.  It  is  similar  to  the  previous  inner  product  coLrowJp. 
with  ca,se  I  now  occuring  when  2(  /  +  1)  4-  1  >  n  and  case  II  when  2(  /  +  1)  4-  1  <  n.  For 
case  II.  the  general  indicial  sets  involved  have  suffixes  and  prefixes  of  length  /  4-  1.  and  the 
"generating"’  1-bit  suffix- based  indicial  sets  are  the  nonempty  =  1-  The  code  for 

coLrow  jp  carries  over  mutatis  mutandis. 
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subrout i ne  col _r o U' J p( .  92 ) 

variable  i.  ip.  p.  p^.  p^.  k.  t.  k\  t\  cnt,  indexl.  index2 
ip  —  0 

/*  Enumerate  general  indicial  sets  associated  with  I©  00^^  * ! 

/or/ii  €  {0.1}'-^ 

Jorp2  €  {0.1}'"^ 

k  —  n{p2)  +  K(//i) 

^  ~  '’■(A'2)  +  -  1)  0  0)  +  T-IO  o  +  r(//i ) 

indexl  —  getJndex(Oopi.k.t) 
index2  —  get.index{{p2  o  Q)^.k.t) 
ip  —  ip  +  [indexl]  *  g2[indtx2] 
endfor 
endfor 

/*  Enumerate  nonempty  Iq k  >  0  */ 

for  k  =  I . n  -  2/  +  1 

for  1  =  1 . m.in(2/c.2(n  _  2/  +  1  -  k)  +  1) 

if  (t  is  even)  p  —  0  else  p  —  i 

I*  enumerate  associated  general  indicial  sets  */ 

for  p^  € 

for  p2  €  {0. 1}'"' 

k  —  k(p2  )  +  k  K{pi ) 

f  -  t(p2)  +  t(p2(1~  l)op)  +  f'  +  r(0opi(l))  +  r(/ii) 
indexl  —  getJndexiO  0  pi.k.t) 
index2  —  get.ind€x((p2  0  p)^.k.t) 
ip  —  ip  +  cnt  *  Pi  [indexl]  ♦  p2[iw<^€x2] 
endfor 
endfor 
endfor 
endfor 

/*  Enumerate  nonempty  Ii  k<n-2l  +  2*/ 
similar  to  code  for  nonempty  k  >  0 

/*  Enumerate  general  indicial  sets  associated  with  Ii'[]i^J+2  0  */ 
similar  to  code  for  IJo  */ 

return  ip 


Figure  8;  Code  for  subroutine  coLrowJp(pi.p2)-  case  II 
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5.4.6  coLrowpJp(5. 

.•\.s  in  Section  5.4.5.  this  inner  product  is  similar  to  coLrowJp.  The  vectors  g  and  h  belong 
to  the  subspaces  span(5„./)  and  span(7^./+i ).  and  their  coefficient  vectors  are  g  and  h 
respectively.  Case  I  occurs  when  /  +  ( /  +  1 )  +  1  >  »  and  case  II  when  /  +  (/  +  l)+l  <  ji-  For 
case  II.  the  general  indicial  sets  involved  have  suffixes  of  length  /  and  prefixes  of  length  /-f  1. 
and  the  “generating”  1-bit  suffix-based  indicial  sets  are  the  nonempty  .  l-^i  =  1.  The 

code  for  coLrowJp  carries  over  mutatis  mutandis. 

5.5  Computation  of  residual  norms 

.Associated  with  an  approximate  eigentriple  (Tt.q.p)  are  the  residuals  Af„g  — and  M^p~-p 
for  the  approximate  column  eigenvector  q  €  span(»f>„./)  and  the  appro.ximate  row  eigenvector 
p  €  span(Tn.i)  respectively.  Since  M„q  €  span(<5n./+i )  ^  span(T„,;+i ).  and  ~q  € 

span(5n./)  and  rrp  c  span(7^./).  we  shall  therefore  consider  the  more  general  residual  norm 
subroutines  below,  where  g  €  is  a  coefficient  vector  for  a  vector  in  either  span(vSn.;) 

or  span(T^./).  and  h  £  Rl*5n.(+il  jg  ^  coefficient  vector  for  a  vector  in  either  span(5„./4.i )  or 
span(Tn,/+i ): 

(a)  coLresidualftf./i)  =  ||.V/^  -  .V/+i5|| 

-  norm  of  the  difference  between  an  approximate  column  eigenvector  and  its  image 

(b)  rowj-esidual(5.5  )  =  WR^Xig  -  R„Xi+ih\\ 

-  norm  of  the  difference  between  an  appro,xjmate  row  eigenvector  and  its  image 

We  remind  the  reader  that  .V/  and  A'/+i  are  the  basis  matrices  for  Sn.i  and  Sn.i+i  respectively. 
Since  multiplication  by  orthogonal  matrices  preserves  the  spectral  norm. 

row_residual(§.  h)  =  \\R„Xig  -  Rn 

=  llAbp  -  A'/+iMl 

=  coLresidual(  g.h), 

and  so  we  need  only  to  consider  the  subroutine  col_residuaI. 

The  subroutine  col_residual  is  very  similar  to  the  subroutine  coLcolpJp.  Both  take  as 
inputs  coefficent  vectors  g  and  h  for  vectors  g  £  span(*?„.;)  and  h  £  span(7^,/)  respectively: 
the  inner  product  of  g  and  h  is  the  sum  of  the  product  of  corresponding  entries,  while 
iip  “  ^11^  's  the  sum  of  the  squared  differences  of  corresponding  entries.  The  code  for 
coLcolpJp  carries  over  with  multiphcation  replaced  by  taking  the  square  of  the  difference, 
and  is  shown  in  Figure  9.  The  subroutine  colj’esidual  therefore  has  the  same  order  of 
running  time  as  that  of  coLcolpjp.  namely  C{2‘n^). 
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subroutine  coLcolpJp{g.h) 
variable  i.  ip,  p.  k.  t 
ip  —  0 

for  i  =  0 . -  1 

(p.k.t)  —  get. triple. prod(i) 

ip  —  ip  +  count. pTod[i]  *  -  g[gei.index{p{2)  o  ■  ■  ■  o  p(}  \).k.t)]) 

endfor 
return  ip 

Figure  9;  Code  for  subroutine  coI_residuaI(5.^ ) 


6  Extracting  Information  from  the  Projections 


The  method  of  minimal  representations  requires  an  efficient  procedure  for  calculating  the 
Perron  root  and  the  dominant  (positive)  eigenvector  of  The  technical  question  is 

whether  there  are  any  algorithms  good  enough  to  make  each  cycle  (/  —  /  +  1).  in  particular 
the  final  one.  of  tolerable  duration.  Since  there  are  at  most  four  nonzeros  per  row  and 
l‘^’n./|  ^  rows  the  formation  of  P^iX  requires  4iP2^~^  multiplications  and  the  same 
number  of  additions.  It  is  customary,  these  days,  to  ignore  the  differences  between  +.  -.  *. 
/  and  to  estimate  simply  flops  (floating  point  operations).  Thus  x  —  Pni^  needs 
flops.  With  no  difficulty  we  can  apply  this  matrix-vector  product  m  times  and  so  work  with 
the  operator  (P^;)"*  to  obtain  more  rapid  convergence. 

As  the  temperature  parameter  in  the  Ising  model  approaches  a  critical  (phase  change) 
value  the  two  largest  eigenvalues  of  approach  the  same  limit  and.  unfortunately  the 
two  dominant  eigenvectors  also  converge.  In  order  to  be  able  to  take  this  situation  in  its 
stride,  without  serious  degradation  in  performance,  our  algorithm  is  required  to  produce,  in 
all  cases,  the  two  largest  (positive)  eigenvalues  and  the  right  and  left  eigenvectors  for  both 
of  them. 

The  simplest  candidiate  is  the  block  Power  Method  with  a  block  size  of  two.  We  have 
implemented  it  but  found  that  it  lags  further  and  further  behind  the  Lanczos  algorithm  as  I 
increases.  We  will  not  describe  that  method  here.  Some  of  our  readers  may  not  be  familiar 
with  the  Lanczos  algorithm  but  we  do  not  wish  to  digress  into  a  detailed  exposition  of  it 
here.  See  [M'iloe].  [GvL89]  or  [PTL80]  for  that.  What  follows  is  a  high  level  commentary 
on  our  use  of  the  Lanczos  algorithm. 

Imagine  that  k  steps  of  the  Power  Method  are  taken  and  each  computed  vector  is  saved. 
Imagine  the  best  approximation  y  to  the  dominant  eigenvector  that  can  be  made  by  taking 
an  appropriate  linear  combination  of  all  A:  vectors.  In  principle  the  Lanczos  algorithm 
computes  y  without  saving  all  k  vectors.  This  last  statement  is  not  strictly  correct  when 
the  linear  operator  (or  matrix)  that  generates  the  power  vectors  is  not  self  adjoint  but  it 
gives  an  idea  of  the  strength  of  the  method.  In  particular  one  can  see  that  the  method  is 
not  strictly  iterative  because  when  k  equals  the  order  of  the  matrix,  if  not  before,  then  y  is 
the  Perron  vector. 

What  the  Lanczos  process  does  for  a  nonsymmetric  matrix  is  the  following.  It  takes  two 
starting  vectors  u\  and  w'hich  must  be  supplied  by  the  user.  By  the  end  of  Step  j  it  has 
computed  four  matrices,  each  with  j  columns: 

f'j  =  («i . w_,).||«,j|  =  1./  =  1 . j. 

'  J  =  ('^’1 . fj)- 

Tj  is  tridiagonal  (entries  given  later),  and 

fij  =  diag(^j\ . 
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together  with  two  extra  vectors  Tj  and  Sj.  In  exact  arithmetic,  if  no  breakdown  occurs. 

Moreover,  by  step  j.  the  vectors  ui, _ Uj_2  and  ui, _ ij_2  are  no  longer  needed  and  are 

discarded. 

.4t  this  point  there  must  be  a  test  to  see  whether  another  step  is  needed.  In  our 
apphcation  we  compute  the  two  largest  eigenvalues  of  the  pair 

Tj5,  =  /=1.2. 

If  di  and  ^2  have  settled  down  to  the  required  accuracy  (we  demand  10  or  12  decimal  figures 
in  agreement  with  6^  and  62  s^t  the  previous  value  of  j)  and  both  are  positive  then  we  stop 
and  compute  gi  and  52-  otherwise  the  Lanczos  process  takes  another  step. 

We  do  not  need  to  describe  what  goes  on  in  one  step.  It  suffices  to  note  that  its  overhead 
(beyond  the  matrix- vector  product)  is  42%  of  the  overhead  of  the  block  power  method  and 
l.S  times  the  overhead  of  the  simple  power  method. 

The  matrix  is  an  oblique  projection  of  P^i  and  the  Lanczos  method  needs  an 

auxiliary  procedure  to  compute  the  wanted  eigenvalues  of  a  tridiagonal  matrix.  Thus  all 
Lanczos  is  doing  is  to  provide  a  tridiagonal  approximation  to  We  use  a  modified  Newton 
iteration  that  consumes  only  2.5%  of  the  total  time  required  by  the  Lanczos  process.  We 
say  a  little  more  about  it  later. 

Let  us  suppose  that  the  step  did  pass  the  test  and  so  and  62  are  accepted  as  the 
largest  two  eigenvalues  of  P^j.  What  remains  to  be  done?  First  one  computes  g-i  and  ^2- 
the  eigenvectors  of  (Tj.Qj).  They  provide  the  coefficients  for  the  approximate  eigenvectors 

J  J 

di  =  X^u.5i(/)  and  h  =  Yl 
1=1  1=1 

However  we  have  discarded  the  earlier  u,'s  and  u,'s  and  cannot  form  these  sums.  So  we 
run  the  Lanczos  process  again,  with  the  same  starting  vectors,  and  this  time  through  we 
accumulate  the  two  sums  before  discarding  each  Lanczos  vector.  Remember  that  these 
vectors  are  of  dimension  not  2".  This  yields  g^  and  52-  Since  P^i  is  similar  to  P^i 

we  can  compute  the  approximate  row  eigenvectors  at  little  extra  cost  from  {ui . r^  }.  We 

need  these  extra  vectors  for  our  error  estimates. 

There  are  several  aspects  that  we  have  glos.sed  over. 

1.  Occasionally  one  of  the  wt.’  values  comes  too  close  to  zero  and  we  then  abort  the  Lanczos 

run  as  a  failure,  pick  new  starting  vectors  and  restart.  So  far  the  second  run  has 
always  succeeded. 

2.  Newton's  method  applied  to  the  characteristic  polynomial  of  our  tridiagonal  is  guaran¬ 

teed  to  converge  from  a  starting  value  greater  than  the  dominant  eigenvalue  but  it 
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can.  in  general,  be  dreadfully  slow  if  the  starting  value  is  not  quite  close  to  the  target. 
Since  we  know  the  entries  of  we  compute  ||P^;||oc  initially  and  start  there.  This 
is  the  right  choice  for  small  values  of  j  but  is  too  cautious  in  the  later  stages  when 
has  settled  down  to  two  or  three  decimals. 

3.  An  unpublished  result  of  W.  Kahan  which  shows  that  the  iteration  that  doubles  the 

Newton  correction  can  never  jump  over  the  largest  stationary  point.  So  we  double 
the  Newton  correction  until  there  is  a  sign  reversal  in  the  Newton  correction. 

4.  There  is  an  algorithm  for  evaluating  the  Newton  correction  nicely  when  the  matrix  is 

tridiagonal  [PN085].  It  avoids  the  rescaling  feature  that  is  the  curse  of  the  three  term 
recurrence  for  the  characteristic  polynomial  and  its  derivative. 

In  closing  'ct  us  return  to  the  big  picture.  Our  Lanczos  code  could  be  improved  in  a  few 
ways  should  the  need  arise.  Most  of  the  Ising  model  computation  is  spent  in  this  section. 
We  have  made  a  few  experiments  with  m.  the  power  of  to  be  used  in  Lanczos.  The 
choice  m  =  2  is  much  better  than  m  =  1  but.  for  reasons  we  do  not  fuUy  understand  .3  and 
4  are  better  than  5  or  6  in  reducing  the  flop  count  for  the  Lanczos  run.  We  cannot  give  a 
simple  expression  for  the  cost  of  computing  the  largest  eigenvalue  of  P^^  together  with  its 
eigenvector  because  the  cost  depends  quite  strongly  on  the  temperature  of  the  Ising  model 
and  we  have  concentrated  our  values  near  the  critical  one.  Our  algorithm  appears  to  be 
linear  in  |»S„.;|.  the  order  of  P^^.  and  that  is  an  important  factor  in  the  usefulness  of  our 
approach . 

We  have  made  a  few  experiments  with  the  choice  of  starting  vectors  u-[  and  tq  in  the 
Lanczos  method.  Using  eigenvectors  from  the  previous  value  of  I  offers  some  improvement 
over  random  starting  vectors. 


48 


7  Error  Estimates 


An  essential  ingredient  in  the  method  of  minimal  representation  is  a  reliable  estimate  of 
the  accuracy  of  the  approximations  derived  from  the  current  projections.  As  explained  in 
Section  5.3.  we  can  compute  the  action  of  the  projection  on  our  approximate  eigenvector 
exactly,  in  the  absence  of  roundoff  errors,  and  so  obtain  residual  vectors  (defined  below) 
and  their  norms.  We  snow'  below  how'  to  use  these  and  related  quantities  to  obtain  the 
dominant  term  in  a  power  series  expansion  for  the  two  eigenvalues  we  seek. 

Here  is  a  standard  result  from  perturbation  theory  for  non-  self-adjoint  matrices. 

Theorem  7.1.1  Let  X  be  a  simple  eigenvlaue  of  B  and  let  x  and  y’  be  any  right  and  left 
eigenvectors  for  A 


Bx  =  xX.  y“  B  =  Xy’ . 

For  small  enough  perturbations  E  there  is  an  eigenvalue  p  of  B  +  E  such  that 

p-X  =  ^  +  0(\\Ef). 
y'x 

It  is  customary  to  define  a  condition  number  for  A  by 

condi  X )  =  — ^ . 

y‘x 

In  our  application  we  compute,  via  the  Lanczos  process,  an  approximate  triple  {n.q.p") 
with  q  and  p  of  norm  1.  W'e  also  compute  residual  vectors 

r  =  Mq  -  qn,  $  —  M'p  —  pi:. 

Our  estimate  turns  on  the  following  result  we  established  in  [KPJ82]. 

Theorem  7.1.2  (■K.q^p")  is  an  exact  eigentriple  for  a  matrix  M  -  E  where 

E  =  rq~  -I-  pS~  —  pwq' 

with 

w  :=  s'q  =  p'r  =  p’Mq  —  p'qn. 

Our  trick  is  to  consider  M  =  {M  -  E)  +  E  as  a  perturbation  of  M  —  E.  If  £  is  small 
enough  then,  by  the  theorem  just  quoted,  there  is  an  eigenvalue  of  M  satisfying 

C  -  TT  =  p-Eqlp'q  +  0(\\Ef) 


but 


p'Eq  =  p'r  -I-  s‘ q  —  w  =  w 


and.  by  definition. 


p‘Eq/p‘q  =  p-T 
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where 

p  p{q,p’)  :=  p'Mq/p’q 
is  the  generalized  Rayleigh  quotient.  Moreover 

ll-E’ll  < 

=  Ikll  +  NI  +  k’l- 

All  quantities  in  the  expressions  given  above  are  computable.  \\  kta  ||£||  is  small  enough, 
we  can  safely  use  the  difference 

\p-t\ 

as  an  error  estimate  for  p  as  an  approximation  to  the  partition  function  per  spin.  We 
can  terminate  the  sequence  of  cycles  when  p  and  ~  agree  to  the  desired  number  of  decimal 
figures.  At  that  time  we  v./iLl  have  discovered  the  minimal  representation  (within  our  family) 
of  Mn  that  gives  the  required  accuracy. 

For  small  values  of  1.  ||£||  may  not  be  small  enough  for  the  linear  term  p  —  ~  to  dominate 
the  rest  of  the  error. 
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8  Numerical  results 


Here  are  the  results  from  a  preliminary  code  using  the  iionsymmetric  Laiiczos  algorithm. 
For  the  hardest  case,  ii  =  30  and  temperature  within  3%  of  critical,  it  took  about  20  seconds 
on  a  Sparc  station  to  obtain  the  partition  function  to  3  decimal  digits,  and  about  5  minutes 
to  obtain  5  decimal  digits.  In  the  tables  below.  GRQ  is  the  generalized  Rayleigh  quotient 
y" M„x/y'.r.  The  temperature  T  =  1.6  is  deep  within  the  ferromagnetic  region.  T  =  2.2  is 
within  '3%  of  the  critical  temperature. 


/  approximation 

GRQ 

dim 

time  (s) 

2  3.5189867614  (2.7  x  lO'**) 

3.5189822267  ( - 1 .8  x  10“® ) 

148 

0.8 

3  3.5189842995  (2.9  x  10-") 

3.5189837782  (-2.4  x  10“") 

232 

1.3 

4  3.5189X39756  (-3.8  x  10“® 

)  3.5189839519  (-6.2  x  10“®) 

352 

2.0 

Table  2:  Results  for  u  =  10.  B  = 

0.0001.  T  =  1.6  (true  eigenvalue 

=  3.5  J 

1X9840135) 

I  approximation 

GRQ 

dim 

time  (s) 

2  2.5925207946  (2.3  xlO'-*) 

2..59224075.33  (-5.1  x  10“") 

14S 

0.x 

3  2.5923360346  (4.4  x  10“") 

2.5921803640  (-1.1  x  10“^) 

232 

1.5 

4  2.5922660120  (-2.6  x  10“" 

)  2.5922266644  (-6.6  x  10“") 

352 

2.4 

Table  3:  Results  for  n  =  iO.  B  = 

0.0001.  T  =  2.2  (true  eigenvalue 

=  2.5922922453) 

1 

appro.ximation 

GRQ 

approximation  -  GRQ 

dim 

time  (s 

2 

3.51X9802741 

3.5189759878 

4.3  x  10“® 

688 

5.2 

3 

3.5189780552 

3.5189731775 

4.9  X  10“® 

1232 

8.3 

4 

3.51X9775223 

3.5189777525 

-2.3  X  10“' 

2192 

17.0 

5 

3.5189776100 

3.5189775601 

5.0  X  10“® 

3872 

35.7 

6 

3.5189776241 

3.5189776145 

9.6  X  10“-* 

6784 

71.7 

1 

3.5189776408 

3.5189777184 

-7.8  X  10“® 

11776 

132.0 

Table  4:  Results  for 

ti  =  20.  B  =  0.0001.  T  = 

1.6 

/ 

approximation 

GRQ 

approxiinatioii  -  GRQ 

dim 

tune  ( 

2 

2.5875164697 

2.5873011057 

2.2  X  lO"-* 

688 

6.8 

3 

2.5873559732 

2.5871852423 

1.7  X  10-* 

1232 

5.2 

4 

2.5872924943 

2.5872247888 

6.8  X  10-’ 

2192 

11.2 

5 

2.58688.50538 

2.5869016769 

-1.7  X  10-’ 

3872 

17.3 

6 

2.5872576018 

2.5872809894 

-2.3  X  10-’ 

6784 

80.9 

i 

2.. 58  724  7-5229 

2.5872981335 

-5.1  X  10-’ 

11776 

76.7 

Table 

5:  Results  for  » 

=  20.  B  =  0.0001.  T  = 

2.2 

/ 

ap])ro.\imation 

GRQ 

approximation  -  GRQ 

dim 

time  (s 

•) 

3.5189798036 

1.5189277829 

5.2  X  10-’ 

l(i28 

11.8 

3 

3.5187421095 

5.5187271685 

1.5  X  10-’ 

3032 

16.6 

4 

3.5189765962 

5.5189734194 

3.2  X  10-'’ 

5632 

.50.9 

5 

3.5189754869 

5.5189630814 

1.2  X  10-’ 

10432 

101.5 

(j 

3.5189767326 

5.5189765436 

1.9  X  10-' 

19264 

213.3 

1 

3.5189774.542 

5.518977.5232 

-6.9  X  10-* 

35456 

472.3 

Table  6:  Results  for  n 

=  .30.  B  =  0.0001.  T  = 

1.6 

I 

appro.ximation 

GRQ 

approximation  -  GRQ 

dim 

time  (s 

2 

2.58(558  7  7396 

2.5864247904 

1.6  X  10-* 

1628 

21.1 

3 

2.5864495960 

2.5863367635 

1.1  X  10-* 

3032 

17.5 

4 

2.58(539  8  9  3  89 

2..58(53409514 

5.8  X  10-’ 

5(532 

38.3 

5 

2.58(53  7  3  85  1  0 

2.586.34290.58 

3.1  X  10-’ 

10432 

64.3 

6 

2.5863633205 

2.5863620747 

1.2  X  10-'’ 

19264 

1.39.1 

i 

2.5863635130 

2..586.3831.549 

-2.0  X  10-’ 

35456 

316.5 

Table 

7;  Results  for  n 

=  .30.  B  =  0.0001.  T  = 

2.2 
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9  Comments  on  the  Ising  model 


The  model  arose  in  Statistical  Mechanics  and  consists  of  a  regular  grid  whose  vertices  are 
considered  to  be  "sites’  that  can  be  in  exactly  one  of  two  possible  states.  In  the  original 
version  [Isi25]  each  site  held  an  orientable  particle  that  could  have  its  spin  ^  parallel  to  the 
external  magnetic  field  =  +1)  or  antiparallel  (/x  =  —  1)  to  it.  Another  application  has 
//  =  +1  if  the  site  contains  an  atom  of  type  A  and  //  =  —  1  if  it  contains  an  atom  of  type 
B.  In  studying  gases  /x  =  +1  if  a  site  is  occupied  by  a  molecule  or  p  =  0  if  it  is  empty.  An 
exceUent  introduction  to  the  Ising  model  targeted  at  a  general  audience  is  [CipS7]. 

Early  work  focussed  on  ID  lattices  but  the  subject  really  came  to  life  in  1944  when 
Onsager  [Ons44]  derived  an  exact  closed  form  expression  for  the  partition  function  (see 
below)  for  an  infinite  2D  grid  with  no  external  magnetic  field.  This  expression  exhibited 
the  desired  singularity  that  signals  a  critical  temperature  Tc  at  which  a  phase  transition 
occurs.  Specifically  the  residual  magnetization  Mo(T)  that  remains  when  the  external 
magnetic  field  is  turned  off  is  positive  and  decreases  steadily  to  zero  as  T  —  Tc  from  below 
but  simply  vanishes  for  all  T  >  T^. 

Exact  solutions  for  nonzero  magnetic  fields  have  not  been  found  so  far  and  a  number  of 
researchers  have  turned  to  approximations.  There  are  two  main  approaches. 

The  combinatorial  method  uses  an  expansion  of  Zjv.  the  partition  function  for  N  sites, 
that  involves  for  its  r'**  term  the  total  number  of  subgraphs  in  an  A-node  graph  with 
exactly  r  edges  subject  to  certain  constraints.  Considerable  effort  has  gone  into  counting 
these  graphs  but  we  shall  say  no  more  on  this  topic.  See  [KacbS]  for  further  discussion. 

The  algebraic,  or  matrix  method  is  based  on  the  creation  of  a  matrix  whose  spectral 
radius  (the  largest  magnitude  among  the  eigenvalues)  yields  the  partition  function  per  spin 
[K\V41].  This  is  where  our  contribution  applies  and  we  now  turn  to  the  partition  function 
and  the  related  transfer  matrix.  The  construction  that  we  describe  will  not  yield  the  transfer 
matrix  that  we  have  used,  but  instead  produces  one  that  is  similar  'in  fact,  it  gives 
R„M^Rn)-  It  has  the  advantage,  however,  of  being  simpler  to  understand. 

Suppose  that  the  grid  contains  A  sites  and  is  subject  to  an  external  magnetic  field  of 

strength  B.  The  interaction  energy  associated  with  a  spin  configuration  /x  =  (^o . 

is  defined  by 

£(p)  =  -^ 

neighbors 

Here  each  /x,  =  ±1.  7  is  the  coupling  constant  giving  the  strength  of  the  spin-spin  inter¬ 
actions  and  g  is  the  magnetic  moment  of  each  spin.  Usually,  neighbors  is  interpreted  as 
nearest  neighbors  but  broader  definitions  are  possible. 

The  "partition  function  per  spin"  at  temperature  T  is  defined  by 

z{J.B,T)  =  [  Y. 

aU 

configurations 
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for  an  A'-site  grid  and  k  is  Boltzmann's  constant.  Several  quantities  of  physical  interest 
can  be  expressed  in  terms  of  c.  By  Boltzmann's  laAv  jg  probability 

of  occurence  of  configuration  fj  at  temperature  T.  The  free  energy  per  lattice  site  at 
temperature  T  is  — kT  Yog  z  ar  d  the  magnetization  per  spin  is  m  =  kT-^  Yog  z  [Tho79]. 
Theorists  sometimes  normalize  g  the  magnetic  moment  of  each  spin  to  be  1  and  for  that 
reason  we  have  omitted  it  as  an  explicit  argument  for  r. 

The  power  of  the  algebraic  approach  comes  from  the  introduction  of  a  matrix  whose 
dominant  eigenvalue  is  exactly  c„(  J,  B.T)  for  a  particular  semi-infinite  lattice  depending 
on  n.  We  indicate  briefl}'  how  this  may  be  done. 

Start  with  a  rectangular  grid  of  sites  with  n  rows  and  N/n  columns.  Let  Zj\j{J.B.T) 
denote  the  total  partition  function  (i.e.  )  for  this  grid.  There  are  several  matrices  that 

can  be  associated  with  tliis  situation,  some  symmetric,  others  not.  We  do  not  know  which 
will  prove  to  be  most  useful  but  describe  the  one  with  the  fewest  nonzero  entries,  the 
duo-diagonaJ  transfer  matrix  M„. 

In  order  to  remove  troublesome  boundary  conditions  the  sites  are  supposed  to  lie  evenly 
spaced  on  a  wire  wrapped  round  an  inner  tube  as  in  a  solenoid.  There  are  n  sites  per  turn 
and  the  last  site.  A'  -  1.  precedes  the  first  site.  0.  There  are  good  reasons  for  counting  like 
a  computer  scientist  since  we  can  now  say  that  we  have  a  chain  of  sites  of  period  A*  and 
the  nearest  neighbours  of  site  j  are  simply  sites  j  ±  l,j  ±  n  mod  A',  uniformly  for  aU  j. 
0  <  ;  <  A'. 

The  extreme  sparsity  of  comes  from  an  apparently  wasteful  redundancy  in  expressing 
the  partition  function.  For  the  moment  we  suppress  g.  J .  B.  T  and  let  Znij-fi)  denote  the 

partial  partition  function  over  sites  0.1.2 _ _  j  +  »  —  1  except  that  the  last  n  sites  are  fi.xed 

at  the  values  p  =  (p,o - ,/in-i)-  Now  add  just  one  more  site  and  observe,  in  detail,  how 

Znij  +  l.t')  relates  to  Using  the  explicit  form  of  E  given  above 

Zn{j+l.i>)=  H  !!••• 

>10=±1  4„_1=±1 

where,  using  the  Kronecker  delta  symbol. 


(M0  +  4n-l  ) 


with  d  =  gB/kT.  7  =  J/kT.  Since  Dm  and  (im+i  indicate  the  spin  at  the  same  site  the 
term  hardly  surprising.  This  is  the  redundancy  mentioned  above.  If  we  wrote 

the  2”  values  Zn(j-lJ-)  as  a  column  vector  Zj  we  would  have 

Zj+l  =  MnZj 


and  the  (D.g.)  entry  of  M„  is  m(D.fi).  The  careful  removal  of  boundary  conditions  has  made 
Mn  independent  of  j:  the  recurrence  has  constant  coefficients.  .After  N  applications  of 
we  have  covered  the  fuD  grid.  It  follows,  after  some  thought,  that  the  (i>.  D)  entry  of  is 
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exactly  the  contribution  to  Z„{X.  J.  B.  T)  when  a  fixed  pitch  (i.e.  turn  on  the  torus),  say 
the  first,  is  fixed  at  the  configuration  P.  So 

=  Zn(N.J.B,T) 

U 

=  trace  . 

Mr,  has  the  nice  property  of  being  a  nonnegative  irreducible  matrix  and  so.  by  a  theorem 
of  Perron,  has  a  positive  eigenvalue  Aj  called  the  Perron  root,  that  satisfies 

|Aj|  <  Ai.  j  >  1. 

By  standard  results  in  matrix  theory  and  analysis. 

Zr,{J.B.T]  =  (trace 

1=1 

—  Ai  as  A’  —  oc. 

Convergence  may  be  very  slow  but  since  the  limit  is  obtained  analytically  the  rate  does  not 
matter. 

In  order  to  produce  a  specific  matrix  A/„  one  must  specify  an  ordering  of  the  2"  config¬ 
urations  /j  that  one  pitch  (or  turn)  can  assume.  The  simple  mapping 

(11-11-1)  -  (11010)  -  2‘‘  +  2^  +  2^  =  26 


yields  the  following  duodiagonal  form,  as  illustrated  for  n  =  4: 


oo 


where  (with  appropriate  normalizations) 

a  =  6  =  and  c  =  gi-^-B)/T 

VVe  repeat  that  the  above  matrix  is  not  the  one  we  have  used,  but  is  similar  to  it. 
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A  P^i  is  similar  to 

This  appendix  uses  notation  that  is  developed  in  Sections  2,  3.  and  4. 

T.  Goddard  [God91]  has  shown  that  the  row  projection  matrix  P^i  is  diagonally  similar 
to  the  column  projection  matrix 

P„^j  =  DP^jD-^ 

for  some  diagonal  matrix  D  which  “commutes"  with  the  basis  matrix  A'/  (for  any  positive 
/  <  n),  i.e.  there  exists  a  diagonal  matrix  D  such  that 

DXi  =  XiD. 

In  this  appendix,  we  exhibit  such  a  diagonal  similarity  D.  and  show  that  it  has  the  required 
property. 

As  usual,  we  let  n  and  /  be  fixed  positive  integers  (/  <  n).  Recall  that  the  column 
projection  matrix  is 

fij  =  Ol'.VrM.A-, 

and  that  the  row  projection  matrix  is 

=  PJ.'A7(P„A/„"Pn)A-,. 

where  Dx  =  X'Xi  and  A"/  is  the  basis  matrix  whose  columns  are  vectors  in  Snj-  In  addition, 
the  transfer  matrix  Mn  and  RnM^Rn  are  special  cases  of  duodiagonal  matrices: 


where 

a  =  exp((2  -  P)/r),  6  =  e.xp(— P/J)  and  c  =  exp((— 2  -  P)/r). 

We  shall  also  identify  indices  i  and  j  with  bit  strings  of  length  n. 

Theorem  A. 1.3  Rr,M^Rn  is  diagonally  similar  to  A/„,  i.e.  there  exists  a  nonsingular  di¬ 
agonal  matrix 

D  =  diag(d{0) . d(2"  -  1)) 

.such  that 

RnM^Rn  =  DMnD~^ 

Prrjof.  Define  D  ~  diag(d(Q) . d(2"  -  1))  by; 

d(i)  =  exp(2{BK  -  t)/T]  (7) 
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where  k  is  the  1-bit  count  of  i.  r  is  the  bit  transition  count  of  and  ')/  and  are  the 
values  of  the  leading  and  trailing  bit  of  i  respectively.  The  (i.j)  entry  of  bMnD~^  is  given 
by  d{i)(Mn)i,jld{j).  So  we  need  to  verify  the  following  (see  observations  (a)  and  (b)  in 
Section  4.2  regarding  the  nonzero  entries  of  f ): 

(a)  for  even  j.  0  <  j  <  2"“^  -  1: 


d{2j)/d(j)  =  ala  =  1 

d(2j  +  l)/d(j)  =  a-^/b  =  exp((2£  -  2)IT) 

(b)  for  odd  j.  0  <  j  <  2"”^  -  1: 

d(2j)/d(j)=  b/a  =  exp{-2/T) 
d(2j  +  l)/d(j)  =  b-'^/b  =  exp(2£/r) 

(c)  for  even  j.  2”~^  <  j  <  2""'  -  1; 

d(2j]ld(j)  =  a/b  =  exp(2/T) 
d(2j  -I-  l)/d(i)  =  =  exp(25/r) 

(d)  for  odd  j.  2"~*  <  j  <  2"~^  -  1; 

d{2j)/d{j)  =  b/b  =  1 

d{2j  ^  l)ld{j)  =  b-^ /c  =  exp{(2  +  2B)/T) 

(e)  for  even  j.  2"~'  <  j  <  3  •  2"~^  -  1: 

d{2j)/d{j)  =  b/a-^  =  exp((2  -  2B)IT) 
d(2j  +  l)/d(j)  =  6-Vt"'  =  1 

(f )  for  odd  J.  2""'  <  J  S  ^  ■  2"’“^  —  1: 

d{2j)/d(j)  =  c/a~^  =  exp{-2B/T) 
d(2j  +  l)/d[j)  =  c~^  /b~^  =  exp{2/T) 

(g)  for  even  j.  .3  •  <  j  <  2"  -  1: 

d{‘2j]/d(j)=  b/b-^  =  exp{-2B/T) 
d(2j  +  l)/d{j}  =  b-^/c-^  =  exp(-2/r) 

( h )  for  odd  j.  3  •  2"~^  <  j  <  2"  —  1: 

d{2j)/d{j}=  c/b-^  =  exp((-2  -  2B)/T) 
d(2;  +  l)Mj)  =  c-'/c'^  =  1 


(8) 

(9) 


(10) 

(11) 


(12) 

(13) 


(14) 

do) 


(16) 

(17) 


(18) 

(19) 


(20) 

(21) 


(22) 

(23) 
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We  shall  verify  (8)  and  leave  the  remaining  verifications  to  the  reader.  Let  j  be  even, 
with  0  <  _;  <  2"“^  —  1.  and  let  a..’  be  the  n-bit  string  corresponding  to  j.  Then 

u,- =  0  o  0  Ou;(3)  o  •  •  •  Ok*;(n  —  1)  0  0,  (24) 

and  the  n-bit  string  corresponding  to  2j  is 

2^'  =  0  0^(3)  0  •  •  •  Ou^(n  -  1)  0  0  0  0.  (2.5' 

It  is  clear  from  (24)  and  (25)  that  j  and  2j  have  the  same  1-bit  count  and  the  same  bit 
transition  count.  So  from  (7),  d(j)  =  d(2j).  □ 

Proposition  A. 1.4  D  “commutes  with  Xj.  ue.  there  exists  a  nonsingular  diagonal  matrix 
D  €  such  that 

DXi  =  XtD. 

Thus 

D-\Xi  =  XiD-^ 


X'D  =  X'D-  =  {DXi)-  =  {XiD)’  =  D’Xf  =  DX". 

Proof.  It  suffices  to  show  that  for  each  column  of  A;,  the  nonzero  entries  in  it  are  multiplied 
by  the  same  constant  when  A/  is  premultiplied  by  D.  Since  the  nonzeros  in  each  column 
of  A’;  occur  in  indices  with  common  1-bit  count,  common  bit  transition  count  and  common 
trailing  (because  /  >  1)  and  hence  leading  bits  (Proposition  4.1.3).  and  the  entries  of  D  are 
characterized  by  these  parameters,  the  result  follows.  □ 

Corollary  A. 1.5  The  row  projection  matrix  P^j  is  similar  to  the  column  projection  matrix 
P^.i- 

Proof.  Using  the  notation  of  Theorem  A. 1.3  and  Propostion  A. 1.4. 

=  D^A7(R„M;«„)A-/ 

=  Dj;KXfbMr,D-\X, 

=  Dx^DXiM„X,D-^ 

=  D{DxKXfMnXi)D~^  since 
=  DP^jD-\  □ 
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